Loading Data into Netezza Using Create External Table

Standard

punch_card.75dpi.rgbNetezza is a super-fast platform for databases… once you have data on it. Somehow, getting the data to the server always seems like a bit of a hassle (admittedly, not as big a hassle as old school punchcards). If you’re using Netezza, you’re probably part of a large organization that may also have some hefty ETL tools that can do the transfer. But if you’re not personally part of the team that does ETL, yet still need to put data onto Netezza, you’ve got to find another way. The EXTERNAL TABLE functionality may just be the solution for you. Continue reading

Common Table Expressions versus Temp Tables in Netezza

Standard

keep-calm-with-or-without-youThis seems to be either a controversial or overly-technical topic: should you use WITH (a common table expression a.k.a. CTE) or a TEMP TABLE. Both can serve similar purposes but each has their own strengths and weaknesses in how they work with other aspects of your query or procedure.

So today we’ll take a look at both without going into crazy detail but covering at least the basics.
Continue reading

Epic Epoch Time in Netezza and PostgreSQL

Standard

EpochTimeBeganEpoch time is both a blessing and a curse. It is super-convenient for counting seconds (and doing calculations based on them) but can also be a pain to try to get into something readable as, or comparable to, a recognizable date. So today we’ll get into and out of epoch to show its flexibility without our brains having to be contortionists too. Continue reading

Seeking Small: Why Big Retailers Envy Mom & Pop

Standard

mom-n-pop-shopThere are a few local stores I visit often – enough so that they know my name, preferences, boyfriend’s name, work/travel schedule, and even my pets. They call me on my cell phone when something new is available that they know I’ll love.¬†They make sure I know in advance about events they’re holding. In return, I’m glad to pick up the phone when I see that they’re calling.

These are the experts in personalization and targeted messaging. They are the kings and queens of the customer relationship. And the big box retailers of the world are tremendously jealous of them.
Continue reading

Fun Uses for Windowing Functions

Standard

I’ve recently come to love (read:be obsessed with) windowing functions in my coding. They’re just so useful and practical.

For those who haven’t experienced the joys of windowing, here’s the deal. They allow you to do calculations across multiple rows without actually having to group, thereby storing aggregate info on each record. That means you keep all the data associated with the row and can add calculated fields that rely on interaction with other rows. Pretty swiffy, huh?

Below are just a few funky functions that I’ve found helpful. I’m not saying that these aren’t resource intensive, but they may just save you from having to join to some crazy aggregation sub-queries and then export to Excel for further manipulation to get the same result. Continue reading

Pretty Girls Don’t Do Math

Standard


A few days ago, I got in touch with the parent of a friend of mine from high school who is a college counselor with Kelleher Cohen Associates in the Boston area. Her job is to help high school students find colleges that fit their personality and academic needs, apply for financial aid, and the complete the application process for the schools.

During the course of conversation, she asked about what I was doing for work. When I started describing marketing analytics to her, she got even more inquisitive. Turns out she has a female student with whom she is working that is very interested in Math. It’s not the student’s best subject, but the one that she looks forward to every day. As this student nears college age, she has expressed that she will likely not pursue math further. When I asked why she would abandon a subject that she enjoys, she said that, according to the student, “Pretty girls don’t do math.” Continue reading

Murphy’s Laws of Data Analysis

Standard
Damn you, Murphy! ... Wrong Murphy?  Damn you, anyway!

Damn you, Murphy! … Wrong Murphy? Damn you, anyway!

Most English-speakers are familiar with Murphy’s Law, which states:

Anything that can go wrong, will go wrong.

Realistically, this should read:

Anything that can go wrong, will go wrong… in the worst possible way at the most inconvenient time.

There are a lot of opportunities for Murphy’s Law to prove itself in the world of data analysis. Here are some of the ones that most often occur in my traipsing through the data. Continue reading

Calculating Percentiles in PostgreSQL

Standard

Slicing data into manageable chunks for viewing is crucial when you start dealing with more records than will fit in something like Excel (without PowerPivot, of course). One of the most common ways to look at data in a more easily-digestible manner is to use percentiles, or some derivative thereof, to group records based on a ranking. This allows you to then compare equal-sized groups to one another in order to form conclusions as to relative behavior.

Most SQL variants have workarounds for how to accomplish this task that may or may not actually cover 100% of your data (may drop a few records here or there when trying to round a percentage to a whole number of records to pull). PostgreSQL, on the other hand, has a handy function built in for doing this sort of thing without having to worry about getting full coverage on your table. Continue reading

Data Really Is Sexy

Standard

Recently, in speaking with the President of Kobie regarding my team, I used the term “data munging” to describe a lot of the work that we do. He laughed, thinking I had said “data munching” (mmmm, tasty!) and asked if that was a technical term. The short answer is that yes, it is another term for data wrangling (which, incidentally, is one of my favorite terms in the industry). Continue reading