Seeking Small: Why Big Retailers Envy Mom & Pop

There are a few local stores I visit often – enough so that they know my name, preferences, boyfriend’s name, work/travel schedule, and even my pets. They call me on my cell phone when something new is available that they know I’ll love. They make sure I know in advance about events they’re holding. In return, I’m glad to pick up the phone when I see that they’re calling.

These are the experts in personalization and targeted messaging. They are the kings and queens of the customer relationship. And the big box retailers of the world are tremendously jealous of them.

Netezza Stored Procedures and Optional Arguments

Let’s say you have a process that requires a start date as a parameter, but may or may not have an end date feeding in. How can you accomplish this in Netezza via a stored procedure? The internet has not had great answers to this question, but it’s not as hard as some other programmers make it out to be. So today we tackle optional arguments – in easy mode.

Fun Uses for Windowing Functions

I’ve recently come to love (read: be obsessed with) windowing functions in my coding. They’re just so useful and practical.

For those who haven’t experienced the joys of windowing, here’s the deal. They allow you to do calculations across multiple rows without actually having to group, thereby storing aggregate info on each record. That means you keep all the data associated with the row and can add calculated fields that rely on interaction with other rows. Pretty swiffy, huh?
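As a rough illustration of the idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (it needs SQLite 3.25 or later for window functions). The `purchases` table, its columns, and the sample values are all made up for the example.

```python
import sqlite3

# Hypothetical sample data: the table, columns and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("ann", 10.0), ("ann", 30.0), ("bob", 5.0), ("bob", 15.0), ("bob", 25.0)],
)

# SUM(...) OVER (PARTITION BY ...) attaches the per-customer total to every
# row instead of collapsing the rows the way GROUP BY would.
rows = conn.execute(
    """
    SELECT customer,
           amount,
           SUM(amount) OVER (PARTITION BY customer) AS customer_total
    FROM purchases
    ORDER BY customer, amount
    """
).fetchall()

for row in rows:
    print(row)
```

All five rows come back, each carrying its customer's total alongside the original amount, so a share-of-total calculation needs no join back to an aggregated sub-query.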

Below are just a few funky functions that I’ve found helpful. I’m not saying that these aren’t resource intensive, but they may just save you from having to join to some crazy aggregation sub-queries and then export to Excel for further manipulation to get the same result.

Quoted and Stuff

I did an interview a couple of weeks ago with Loyalty360 and now they’ve put out an article from that. It’s about Data Science and suchness. I guess I’m actually getting to be somebody… or something.

You can read the article here. It may require that you register but it should be free of charge.

Pretty Girls Don’t Do Math

A few days ago, I got in touch with the parent of a friend of mine from high school. She is a college counselor with Kelleher Cohen Associates in the Boston area. Her job is to help high school students find colleges that fit their personality and academic needs, apply for financial aid, and complete the application process for the schools.

During the course of conversation, she asked about what I was doing for work. When I started describing marketing analytics to her, she got even more inquisitive. Turns out she is working with a female student who is very interested in math. It’s not the student’s best subject, but it is the one that she looks forward to every day. As this student nears college age, she has expressed that she will likely not pursue math further. When I asked why the student would abandon a subject that she enjoys, the counselor said that, according to the student, “Pretty girls don’t do math.”

When Lag is Awesome

Imagine, if you will, being able to calculate the time between visits to a website, transactions in a store, logs from a punch-clock, etc. in just one step. Well, I have found the way!
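Going by the post title, the one-step trick is presumably the LAG window function. Here is a minimal sketch of that idea, again using Python's sqlite3 module (SQLite 3.25 or later) as a stand-in; the `visits` table and its dates are hypothetical, and `julianday` is SQLite's date helper rather than anything Netezza- or PostgreSQL-specific.

```python
import sqlite3

# Hypothetical visit log: names and dates are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visitor TEXT, visit_date TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("ann", "2024-01-01"),
        ("ann", "2024-01-04"),
        ("ann", "2024-01-10"),
        ("bob", "2024-02-01"),
        ("bob", "2024-02-02"),
    ],
)

# LAG(visit_date) looks one row back within each visitor's ordered history,
# so the gap between visits is computed without a self-join.
gaps = conn.execute(
    """
    SELECT visitor,
           visit_date,
           julianday(visit_date) - julianday(
               LAG(visit_date) OVER (PARTITION BY visitor ORDER BY visit_date)
           ) AS days_since_last_visit
    FROM visits
    ORDER BY visitor, visit_date
    """
).fetchall()

for row in gaps:
    print(row)
```

The first visit for each visitor has no previous row, so its gap comes back as NULL (None in Python) rather than zero.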

Correlated Query Error in Netezza

Beta… you know… for correlation?

Fun and interesting error today. Here’s the actual error text:

Error 2: this form of correlated query is not supported – consider rewriting

I’d never heard of this “correlated query” business before so I had to look it up to sort out what was going on. Turns out that you can reference a table in the outside part of a query from within a subquery by calling the alias… Or, rather, you can’t in Netezza.

Tip of the day:

Check for alias references in your subquery and get rid of them.
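One common way to get rid of such a reference is to turn the subquery into a standalone aggregate and join to it. The sketch below shows that generic rewrite using Python's sqlite3 module; it does not reproduce the Netezza error, the `orders` table is hypothetical, and which correlated forms Netezza actually rejects is not something this example can confirm.

```python
import sqlite3

# Hypothetical orders table, invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ann", "2024-01-01", 10.0),
        ("ann", "2024-03-01", 30.0),
        ("bob", "2024-02-01", 5.0),
        ("bob", "2024-02-15", 25.0),
    ],
)

# Correlated form: the subquery reaches out to the outer alias "o".
correlated = conn.execute(
    """
    SELECT o.customer, o.order_date, o.amount
    FROM orders o
    WHERE o.order_date = (
        SELECT MAX(i.order_date) FROM orders i WHERE i.customer = o.customer
    )
    ORDER BY o.customer
    """
).fetchall()

# Rewritten form: the subquery stands on its own (no outer alias inside it)
# and is joined back to the outer table instead.
rewritten = conn.execute(
    """
    SELECT o.customer, o.order_date, o.amount
    FROM orders o
    JOIN (
        SELECT customer, MAX(order_date) AS last_date
        FROM orders
        GROUP BY customer
    ) latest
      ON latest.customer = o.customer
     AND latest.last_date = o.order_date
    ORDER BY o.customer
    """
).fetchall()

print(correlated == rewritten)
```

Both queries return each customer's most recent order; only the second keeps the subquery free of references to the outer query.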

Murphy’s Laws of Data Analysis

Damn you, Murphy! … Wrong Murphy? Damn you, anyway!

Most English-speakers are familiar with Murphy’s Law, which states:

Anything that can go wrong, will go wrong.

Realistically, this should read:

Anything that can go wrong, will go wrong… in the worst possible way at the most inconvenient time.

There are a lot of opportunities for Murphy’s Law to prove itself in the world of data analysis. Here are some of the ones that most often occur in my traipsing through the data.

Calculating Percentiles in PostgreSQL

Slicing data into manageable chunks for viewing is crucial when you start dealing with more records than will fit in something like Excel (without PowerPivot, of course). One of the most common ways to look at data in a more easily-digestible manner is to use percentiles, or some derivative thereof, to group records based on a ranking. This allows you to then compare equal-sized groups to one another in order to form conclusions as to relative behavior.

Most SQL variants have workarounds for this task that may or may not actually cover 100% of your data: they can drop a few records here or there when rounding a percentage to a whole number of records to pull. PostgreSQL, on the other hand, has a handy function built in for doing this sort of thing without having to worry about getting full coverage on your table.
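The teaser doesn't name the function. One built-in that fits the description is the `ntile()` window function, so the sketch below assumes that is what is meant. It runs against Python's sqlite3 module (SQLite 3.25 or later), which also provides NTILE, and the `customers` table is hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical data: ten customers with spend of 10, 20, ... 100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, spend REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, i * 10.0) for i in range(1, 11)],
)

# NTILE(4) assigns every row to one of four buckets by rank. When the row
# count does not divide evenly, the earlier buckets get the extra rows,
# so no record is dropped.
ranked = conn.execute(
    """
    SELECT id, spend, NTILE(4) OVER (ORDER BY spend DESC) AS quartile
    FROM customers
    ORDER BY spend DESC
    """
).fetchall()

bucket_sizes = Counter(quartile for _, _, quartile in ranked)
print(sorted(bucket_sizes.items()))
```

With ten rows and four buckets, the groups hold 3, 3, 2 and 2 rows, which adds up to full coverage of the table.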

Regular Expression in PostgreSQL

There is a whole set of fields in the databases I’m using here that are tilde-delimited (~) varchar strings with a mess of key-value pairs, the values from which I really need. Unfortunately, since they vary in character length and appear in no particular order within that field, it is impossible to substring your way efficiently through them. Thankfully, there is a RegEx genius on my team who produced a handy chunk of code that pgSQL can easily recognize, parse and process for pulling precisely what I need.
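The team's actual expression isn't shown in this teaser, so what follows is only a sketch of the general approach in Python. It assumes each pair looks like `key=value`, which the source does not state, and the helper name `extract_value` and the sample string are hypothetical.

```python
import re


def extract_value(field, key):
    """Return the value stored under `key` in a tilde-delimited string, or None."""
    # (?:^|~) requires the key to start the string or follow a tilde, so
    # "id" does not match inside "uid". [^~]* captures up to the next tilde.
    pattern = r"(?:^|~)" + re.escape(key) + r"=([^~]*)"
    match = re.search(pattern, field)
    return match.group(1) if match else None


# Assumed layout: key=value pairs in no particular order, separated by "~".
sample = "color=red~uid=42~size=large~id=7"

print(extract_value(sample, "size"))
print(extract_value(sample, "id"))
print(extract_value(sample, "weight"))
```

Because the pattern anchors on the delimiter rather than on position, the order and length of the pairs don't matter. The same idea should carry over to PostgreSQL's regular-expression functions, though the exact syntax there may differ.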

Lexy Kassan

The Truth Against the World