Clip Man

daniele


Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery


Ubuntu screen-profiles customization
I recently loaded Ubuntu server 9.04 onto a new machine and encountered Ubuntu's screen-profiles.
In general, I like it. I had one problem and one customization that I wanted to share:

I use Mac OS X's Terminal.app to connect to my remote machines, and by default, it has custom mappings for F1 through F4. I have no idea what those keybindings mean, but they prevent screen-profiles's keybindings from working. It took a little fiddling to figure out how to fix them. Basically, you need to:
  1. Open up the preferences dialog for Terminal.app
  2. Go to the Settings pane
  3. Click on the Keyboard tab button
  4. Edit the action for each of the F1 through F4 keys
  5. When editing, click the "delete one character" button twice to erase the characters currently in there (leave the \033 escape)
  6. Type the following characters: [ 1 1 ~ (use 11 for F1, 12 for F2, 13 for F3, and 14 for F4)
  7. The new entries should look just like the F5 through F8 actions.
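
As a rough illustration of what each action should contain after the steps above, here is a small shell sketch. The function name fkey_sequence is made up for this example; it only prints the text of the action string (the \033 escape followed by the characters typed in step 6) and does not change any Terminal.app setting.

```shell
# Hypothetical helper: print the Terminal.app action string for F1 through F4.
# \033 is the escape that is left in place in step 5; the rest is typed in step 6.
fkey_sequence() {
    case "$1" in
        1|2|3|4) printf '\\033[1%s~\n' "$1" ;;
        *) echo "only F1 through F4 are covered here" >&2; return 1 ;;
    esac
}

# List all four mappings, e.g. "F1: \033[11~"
for n in 1 2 3 4; do
    printf 'F%s: ' "$n"
    fkey_sequence "$n"
done
```
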
Once I was able to use the F2, F3, and F4 keys, I decided that their default actions weren't that useful to me. I prefer to use a combination of screen regions and windows. The window commands are very easy for me, but I've always found the split, focus, and remove keybindings uncomfortable, so I figured those would be great commands to map to F2, F3, and F4. Here is how I did that:
  1. sudo cp /usr/share/screen-profiles/keybindings/common /usr/share/screen-profiles/keybindings/regions
  2. sudo vi /usr/share/screen-profiles/keybindings/regions
  3. replace the first four entries with the new entries below
  4. save and close the file
  5. In screen, hit F9 to bring up the menu
  6. Select the option for "Change keybinding set"
  7. Select the new "regions" entry
  8. Hit F5 to reload your screen-profile and pick up the new keybindings.

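# The register stores a key sequence: ^aS (split), ^a^i (focus next region), ^a^c (new window), ^aA (prompt for a title)
register n "^aS^a^i^a^c^aA" # | Goes with the F2 definition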
bindkey -k k2 process n # F2 | Create new region and window (and name it)
bindkey -k k3 focus # F3 | Next region
bindkey -k k4 remove # F4 | Remove region

Shell script analytics
I just made a rather lengthy post on the Mozilla blog of data about shell script analytics.  I'll try hard not to cross post stuff like this too often, but I thought I'd allow myself the spam this time around because using Bash and AWK to do things like this really is an important part of who I am personally as a geek in addition to what I do for Mozilla. :)

I've always thought my job was fun. Now I hear it is sexy too!
I just finished reading this lovely little post from the company Dataspora titled The Three Sexy Skills of Data Geeks.

By far, my favorite quote was, "A good data munger excels at turning coffee into regular expressions and parsers".  That certainly describes me to a tee. :)

I've always found each of these three facets of working with data fascinating.  One of the comments mentioned that decision making was an important missing trait.  I could go either way there.  I feel it is good to be able to tell a compelling story with the data that helps others to understand it, and then those people take the understanding you imparted to them and make decisions based on it.

It is incredibly hard to find a person who is skilled in even one or two of these facets.  When you find the data geek who has all three, then you count yourself lucky.  Expecting someone who has that caliber of devotion to data to also be capable of making decisions like a CEO is a bit unrealistic in my opinion.

Anyway, the article is a good, quick read.  It also quite nicely summarizes the major passions in my professional life right now.


Interesting crowd-sourced solution sites
A good friend of mine runs the site bug.gd (and its more professional pseudonym, errorhelp.com).  This service provides something that is slightly missing from the typical Google search for an error to find a solution.  It allows you to enter the full text of the error message or stack trace instead of just a couple of keywords, and it provides rich community feedback on solutions.  You can even tip people for their solutions through tipjoy.com integration.

I recently came across two other nice sites created by a different company that provide a similar and complementary service:
stackoverflow.com - A site dedicated to crowd-sourcing answers to programming questions
serverfault.com - A site dedicated to crowd-sourcing answers to system administration questions

I think it is very helpful to have a list of sites like these where you can post a question and, hopefully, get an answer that the community has moderated to help you judge its value.  That typically takes a lot longer if you have to hunt for a forum or mailing list and post there.  While these sites are less immediate than IRC, the moderation and the ability to leave a question and get an answer "soon" are nice features you are less likely to see on IRC (although I've always gotten great results from #java, #sql, #mysql, and #bash).

As you can tell, I'm a big fan of crowd-sourcing.  I have run a couple of contests on 99designs.com and have been incredibly pleased with the results that came out of that community of freelance graphic designers.

Check these places out and see if they can help you or if you can help them!


Ways to visualize and share data
Mozilla needs to be able to provide useful extracts of data such as download trends, etc. and allow the community to perform their own analysis on them, so I'm always keeping a lookout for useful tools to further that goal.

When Tony Wright posted the blog entry Just How Important is the Valley? Let’s Look at some Data on April 17th 2009, he was kind enough to publish the data set (it needs an attribution / license though) and the data looked interesting so I thought I'd spend a little time playing with it using some tools that I've been keeping my eye on.

First, I slurped the table into DabbleDB, a website that is very well suited to messing with this type of data (i.e. sourced from the web, might need a bit of cleanup, etc.). You can view and edit the data I imported to DabbleDB here: Acquired Startups Data

DabbleDB does a great job at allowing a user to sort, filter, group, and modify data using a simple interface, but it does not have a large array of visualizations. For that, we head over here to the IBM AlphaWorks lab's project, Many Eyes Wikified.  I created a quick wiki dashboard for throwing together a few visualizations: Acquired Startups Visualizations

This was just a quick break from real work I've been doing, so I spent less than an hour on this.  I only took about 20 minutes with DabbleDB: importing the data, cleaning the dollar values, then creating two new views that group the data by country or by state for visualization.  Then I moved over to Many Eyes and played with a few visualizations to try to find some interesting views of the data and threw them into the dashboard and two sub pages.

Being able to quickly extract, transform, and visualize this data is the big win for DabbleDB and Many Eyes in my opinion.  With both applications having open licensing of the data and collaboration as a key focus, they are tools that I hope to be able to take advantage of at Mozilla soon.






Counting unique visitors in SQL
A lot of web metrics solutions out there like NetTracker or Omniture allow you to perform analysis on the number of unique visitors over time. This is a pretty important metric to a lot of companies, and I recently needed to perform such an analysis, but it was on data stored in a SQL database rather than in one of these proprietary solutions' data stores.

Doing any sort of distinct counting on a large volume of data in SQL can be very costly, both in terms of storage of the raw data (since you can't aggregate it), and in query performance since there are relatively few optimizations that can be performed on the table or the query.
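
To make the idea of distinct counting concrete, here is a minimal sketch outside of SQL, using awk on a plain text extract. It is not the implementation this post describes; the input layout (one date and one visitor ID per line) and the function name count_unique_visitors are assumptions for illustration only. Note that it has to remember every date/visitor pair it has seen, which is the same raw-data cost described above.

```shell
# Hypothetical helper: read "date visitor_id" lines on stdin and
# print the number of distinct visitors per date.
count_unique_visitors() {
    awk '
        # Count a visitor only the first time this date/visitor pair appears
        !seen[$1, $2]++ { unique[$1]++ }
        END { for (day in unique) print day, unique[day] }
    ' | sort
}

# Made-up sample data: alice visits twice on the first day but counts once
printf '%s\n' \
    '2009-05-01 alice' \
    '2009-05-01 bob' \
    '2009-05-01 alice' \
    '2009-05-02 alice' | count_unique_visitors
```
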

Below are some highlights of how I implemented this.

TinyArro.ws URLs
A friend just released a URL shrinking service that I enjoy:  tinyarro.ws (more nifty when written as .ws).
It has a few great features over the current mainstream shrinkers:

1. Cool/fun URLs (e.g. http://➽.ws/囨 for my website)
2. Very short URLs due to Unicode suffixes (great for Twitter!)
3. Preview by default! (no tweak to the URL to remember)
4. Option to enter your own custom suffix (TinyURL now has this, but it was too useful to not mention).
5. A Ubiquity command ›.ws/ (eventually to be integrated directly on the site)

Some news about the site:
TinyArro.ws: 10 new unicode domains. Defaulting previews to ON.
Ask HN: Thoughts on TinyArro.ws? Tiniest urls in the world (or your money back)

Willingness to be a little evil
I have been a supporter of Firefox and Mozilla for several years now, and while I don't write patches and fix bugs, a major part of that support is educating people about Mozilla, open source, and user empowerment whenever a conversation about technology allows for it.

I've found that people who use proprietary software and operating systems often fall into two broad categories for rationalizing that choice:
1. They are told to do so by some authority (usually their employer, sometimes their social tech support person, and in some cases, just because they were told it was the right thing to do by an ad or magazine article).
2. They started using it for some reason (typically reason #1 above) a long time ago and are now just accustomed to it.

I'm sure all this is going to be old news to most people reading this, but I bring it up because of an interesting article I read today.

In the 1960's and early 70's, psychologist Stanley Milgram performed a series of famous experiments that tested the willingness of people to do something they would normally object to on moral grounds when they are in a strictly controlled environment and instructed to do so by an authority figure.

More recently, psychologist Jerry Burger had the opportunity to perform a series of similar experiments.  This alternet article describes the story and discusses the findings.  As I read the results and Dr. Burger's statements regarding the findings, I started thinking about how easy it is for people to choose to give up their freedom to a piece of proprietary software for reasons similar to the ones described in these experiments.

In a green field, these people would normally opt for software that provided them with more freedom and in many cases, subjectively better security, but because they are instructed by an authority figure, or because they got started with it a long time ago and just slid deeper and deeper in, those preferences are not enough by themselves to prompt the person to change their behavior.

Now even this thought in and of itself would not be enough to prompt me to blog about this topic.  We're still well in the territory where the people who haven't gotten lost in a Wikipedia article about toothbrush hygiene they found when they clicked my first link are saying, "um, DUH!"  So here is my point:

At the end of the article, Dr. Burger focuses on an interesting finding of both experiments.  When a person is instructed to do something "wrong", they are significantly less likely to do so if they are surrounded by peers who object first.

So when you talk to someone who is sighing about how much they hate product X but they don't have a choice, don't hate on them and don't deride them for not having a backbone, but just tell them and show them how you chose to stand up for your freedom and your security.  An example can go a long way toward giving them the courage to listen to that little voice inside saying, "I want something better!"


Bash functions for going up to a directory
Sometimes, if I'm in a really deep directory, I don't want to cd from / nor do I want to cd ../../../..
I just want to either go up 5 directories, or maybe I want to go up to the parent directory "src" when I'm in /home/dre/src/projects/foo/bar/classes/org/apache/blah

This set of Bash functions lets me do that.
The first, up(), will change your directory. The second, upstr(), will instead just print the desired directory name.  This makes it easy to mv a file up to a higher directory, for example.

If you pass no arguments, it just goes up one directory.
If you pass a numeric argument it will go up that number of directories.
If you pass a string argument, it will look for a parent directory with that name and go up to it.
(Note, there is a small display bug there. If you give it an invalid name, cd reports the "No such file or directory" error, which is good, but it has a bogus path. Since you can't know what path they were actually trying to go to, it should just say "No such parent directory: ${yourbogusname}". I don't have time to figure that out right now though.)

Just put these functions in your ~/.bashrc file and don't forget to source it. (  source ~/.bashrc )

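# Go up the directory tree: no argument = one level, a number = that many
# levels, a name = the nearest parent directory with that name.
function up()
{
    local dir="" x=0    # keep the helper variables out of the calling shell
    if [ -z "$1" ]; then
        dir=..
    elif [[ $1 =~ ^[0-9]+$ ]]; then
        while [ "$x" -lt "$1" ]; do
            dir=${dir}../
            x=$((x+1))
        done
    else
        # Strip everything after the nearest "/$1/" component of $PWD
        dir=${PWD%/$1/*}/$1
    fi
    cd "$dir"
}

# Print the directory that up() would go to, without leaving the current one
function upstr()
{
    echo "$(up "$1" && pwd)"
}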
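
One possible way to address the display bug noted above is to check whether the name actually occurs as a parent directory before calling cd. The following is only a sketch under that assumption: the function name up_checked is hypothetical, and the error message uses the wording suggested in the note.

```shell
# Hypothetical variant of up() that reports an unknown parent name itself
# instead of letting cd complain about a bogus path.
up_checked() {
    local dir="" x=0
    case "$1" in
        "")
            dir=.. ;;
        *[!0-9]*)
            # Not a number: treat the argument as a parent directory name
            case "$PWD" in
                */"$1"/*) dir=${PWD%/"$1"/*}/$1 ;;
                *) echo "No such parent directory: $1" >&2; return 1 ;;
            esac ;;
        *)
            # A number: build that many "../" components
            while [ "$x" -lt "$1" ]; do
                dir=${dir}../
                x=$((x + 1))
            done ;;
    esac
    cd "${dir:-.}"    # "up_checked 0" stays in the current directory
}
```

With this variant, running up_checked src from /home/dre/src/projects/foo/bar would change to /home/dre/src, while a name that is not a parent directory prints the error and leaves the current directory untouched.
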

All hail Ken Kovash!
It may be showing my ignorance, but I was unaware until recently of the officially recognized day for celebrating the man, the myth, and the math that is Ken Kovash.  To think that all the time leading up to this point, I had just been satisfied with the joyous feeling in my heart every day I interacted with him.

Ken can be a harsh task-master sometimes.
"Daniel, where are my numbers from yesterday?"
"Daniel, why are the funnelcake trends low here and high there? Your data are wrong, go find it and fix it!"
But the pain is worth it when I see him take my crude raw data and masterfully sculpt it into bounteous bevies of tables, raging rivers of trend lines, triumphant towers of bar charts, overwhelming ontologies of pie graphs, and gilt-edged grids of treemaps.

One must weep to behold it.
 
