Post

Tactic #1: A night at the movies

In Risk on November 8, 2009 by hudgeon

Carrying on from the previous post, the top 20 contribution clusters provide some interesting insights into potential tactics for the Democrats and Republicans to pursue to increase their contributions. The top 20 are as follows:

There’s quite a breadth of industries represented in the top 20 ranging from TV/Movies/Music in California, to Securities and Investment in Connecticut to Lawyers in Florida and Homemakers in Michigan (Homemakers in Michigan?!?) and Hospital/Nursing Home staff and Health Professionals in Texas.

Each of these clusters represents an opportunity to identify why a particular cluster has developed in a particular region and determine whether that environment can be replicated in other regions.

The stand-out cluster is the “TV/Movies/Music” cluster at the top of the list with 3 times the level of contribution of the next highest cluster.

Let’s look at who’s in the cluster:

Whilst it may be interesting to note that Steven Spielberg, Seth Rogen, Rob Reiner, Jamie Foxx, Melanie Griffith, Michael Douglas and Jim Carrey all live in the same area and donate about the same amount of money to the Democrats, it’s not apparent at first how this knowledge can help us. After all, how do you replicate communities of left-leaning Hollywood actors across the country (and would you want to)?

The above chart shows the dates on which that cluster’s donations were made. Interestingly, over half of all the contributions in this cluster occurred on the same day or the day after.

Whilst you may not be able to replicate Hollywood across the country, you can at least identify why similar contributions were made on this day and attempt to replicate that across the entertainment industry contributors.

Post

US Political Contributions 2010: Cluster Analysis

In Risk on November 8, 2009 by hudgeon

Over the past 10 years, I’ve spent much of my time pulling actionable insights out of data. Perhaps I shouldn’t admit this out loud, but I enjoy it.

I’ve recently been experimenting with an approach I’ve developed for identifying clusters of similar transactions within a data set. Seeing every group of clustered transactions within a data set enables one to identify the most interesting clusters, deep dive into why those transactions are clustered and devise tactics to get more clusters if they’re ‘good’ or fewer clusters if they’re ‘bad’.

The approach I’m experimenting with compares every record to every other record for large data sets. This is non-trivial given the quadratic growth in the number of comparisons as the data set grows. Consider for example a data set with 3 million rows of data and 4 columns. To compare each record with every other record may require 36 trillion calculations (3M * 3M * 4). It’s easy to cut this number by more than 1/2 by not comparing each record to itself and by only doing the comparison in one direction, i.e. assuming that comparing A to B is the same as comparing B to A, but that still leaves almost 18 trillion calculations and reducing that number further gets trickier.
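As a rough check on the arithmetic above, here is a minimal Python sketch that counts the column-level comparisons for the naive case and for the reduced case (no self-comparisons, each pair compared once). The function name is made up for illustration and it is not part of the clustering software.

```python
def comparison_counts(rows, columns):
    """Return (naive, reduced) counts of column-level comparisons."""
    # Naive: every record against every record, including itself.
    naive = rows * rows * columns
    # Reduced: skip self-comparisons and treat A-vs-B the same as B-vs-A.
    reduced = rows * (rows - 1) // 2 * columns
    return naive, reduced


naive, reduced = comparison_counts(3_000_000, 4)
print(f"{naive:,}")    # 36,000,000,000,000
print(f"{reduced:,}")  # 17,999,994,000,000
```

The reduced count is only a shade under half of the naive count, and it still grows with the square of the number of rows, which is why further savings have to come from not comparing most pairs at all.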

I’ll discuss my approach in more detail over the coming posts but let’s take a look at some results:

The maps below show the political contributions made by individuals to political parties in the US in 2010. Let’s imagine that we are working for a particular political party and we want to increase the amount of contributions we receive from individuals. One way of doing this would be to identify groups of similar people who contribute large amounts of money, understand why those people are big contributors, and then attempt to create more of those groups of people across the country. To do this, we’ll look at the base data set, we’ll run it through the clustering algorithm, we’ll identify the top 20 clusters that we want to investigate further and devise tactics for replicating the ‘good’ clusters.

The underlying data set was created by OpenSecrets, a fantastic group in the US collecting, cleansing and posting political data.

Full data set map

The above maps provide some interesting tidbits: in 2010 year-to-date, for example, Puerto Ricans contributed to Republicans, Democrats and Independents, whereas Hawaiians preferred to stick with the major parties (note that the data set shows only direct contributions and excludes PAC contributions). Whilst this information is interesting, it is not particularly actionable and doesn’t help us achieve our goal of increasing the amount individuals contribute to our party.

Now let’s run the data set through the clustering algorithm and look at the clusters:

The following map shows clusters of contributions with a similarity rating of 98%, containing 10 or more transactions, contributing more than USD $10,000 to the coffers of the party (using v0.01a of the software).
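The three thresholds above amount to a simple filter over the clusters. The sketch below is an illustration only: the `Cluster` structure, the field names and the sample values are all assumptions, and it reads “a similarity rating of 98%” as “at least 98%”. It does not show how the software computes similarity.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    label: str          # hypothetical identifier for the cluster
    similarity: float   # similarity rating, 0.0 to 1.0
    amounts: list       # contribution amounts in USD, one per transaction


def select_clusters(clusters, min_similarity=0.98, min_transactions=10, min_total=10_000):
    """Keep clusters that pass all three thresholds, largest total first."""
    keep = [
        c for c in clusters
        if c.similarity >= min_similarity
        and len(c.amounts) >= min_transactions
        and sum(c.amounts) > min_total
    ]
    return sorted(keep, key=lambda c: sum(c.amounts), reverse=True)


# Made-up clusters, purely to exercise the filter.
sample = [
    Cluster("A", 0.98, [2_000] * 12),  # passes: 12 transactions, $24,000
    Cluster("B", 0.98, [500] * 12),    # fails: total is only $6,000
    Cluster("C", 0.95, [5_000] * 20),  # fails: similarity below 98%
    Cluster("D", 0.99, [4_000] * 9),   # fails: fewer than 10 transactions
    Cluster("E", 0.99, [3_000] * 10),  # passes: 10 transactions, $30,000
]
print([c.label for c in select_clusters(sample)])  # ['E', 'A']
```

Sorting by total contribution is what makes a “top 20” list a matter of taking the first 20 entries.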

This looks better. We can see some interesting information now and we can start asking some questions. What is the large Democrat cluster in LA? Are the Republican clusters in Florida replicable across the country or are they particular to the region? It looks like the Democrats in South Texas and Michigan are doing some amazing things. What are they?

But the first question we’ll ask is “What are the top 20 clusters by contribution amount and what tactic should we apply to replicate the top cluster?”

(To be continued…)

By the way, the reporting tool used to generate the above charts is Tableau. You can download the Tableau data set used in the above analysis here and the csv file here. Tableau has a free ‘reader’ version that you can use to explore the data further. If you have the full version of Tableau you can create your own charts using the data set. I’d appreciate seeing your findings.

Post

Goodbye for now…

In Vendor Management on July 17, 2009 by hudgeon

It’s been an exciting few months in my world. Over the past year I’ve been moving out of procurement / vendor management and into shared services more broadly. This has culminated in a couple of promotions and a posting to Delhi for a year, where I’ll be working out of our Gurgaon office. Given my new focus and new locale, I’m parking this blog for a while.

I’ll re-commence when I again have something to say.

Thanks for your time over the past three years. I’ve greatly enjoyed the ongoing dialogue I’ve had with you and I feel honoured that my missives, coherent or otherwise, have informed and entertained.

All the best,

Doug

Post

US 2008 Election Campaign Spend Data – Part 02

In Vendor Management on April 16, 2009 by hudgeon

The first thing you want to do when you get some spend data is look at it grouped into meaningful categories. I typically categorise data using a 5-stage process:

Standard spend categorisation

  1. First, categorise every row as “Other”. (This allows you to claim that you have categorised 100% of your data and you’re only on step 1!)
  2. Second, categorise each row according to a GL/Category mapping table. (This ensures you get the tail spend.)
  3. Third, categorise each row according to a vendor mapping table for your significant vendors. (This ensures you’re not embarrassed by a PA who has for reasons unknown placed a KPMG invoice into “Kitchen Supplies”.)
  4. Fourth, categorise each row according to a mapping table for vendor and GL. (This ensures that you split your Staples spend into stationery and office equipment etc.)
  5. And finally, categorise each row according to key phrases in the description. (This ensures you pull out PWC legal spend from the Audit category.)
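The five steps above work as a cascade in which each later, more specific rule overrides the earlier one. Here is a minimal Python sketch of that idea. The row fields, GL codes and mapping tables are all made up for illustration; real mapping tables would be far larger.

```python
def categorise(row, gl_map, vendor_map, vendor_gl_map, phrase_map):
    """Apply the five steps in order; later steps override earlier ones."""
    category = "Other"                                   # step 1: everything is "Other"
    category = gl_map.get(row["gl"], category)           # step 2: GL/Category mapping
    category = vendor_map.get(row["vendor"], category)   # step 3: significant vendors
    category = vendor_gl_map.get((row["vendor"], row["gl"]), category)  # step 4: vendor + GL
    description = row["description"].lower()
    for phrase, phrase_category in phrase_map.items():   # step 5: key phrases
        if phrase in description:
            category = phrase_category
            break
    return category


# Hypothetical mapping tables.
gl_map = {"6100": "Kitchen Supplies", "6200": "Stationery"}
vendor_map = {"KPMG": "Audit", "PWC": "Audit", "Staples": "Stationery"}
vendor_gl_map = {("Staples", "6300"): "Office Equipment"}
phrase_map = {"legal": "Legal"}

rows = [
    {"vendor": "KPMG", "gl": "6100", "description": "Annual audit fee"},
    {"vendor": "PWC", "gl": "6400", "description": "Legal advice on lease"},
    {"vendor": "Staples", "gl": "6300", "description": "Desk chairs"},
    {"vendor": "Corner Cafe", "gl": "9999", "description": "Misc"},
]
for row in rows:
    print(row["vendor"], "->", categorise(row, gl_map, vendor_map, vendor_gl_map, phrase_map))
```

The order matters: the miscoded KPMG invoice starts as “Kitchen Supplies” at step 2 and is only corrected to “Audit” because the vendor table is applied afterwards.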

Categorising Campaign Data

Fortunately for me, the OpenSecrets campaign data is expense coded and a mapping table was posted to the OpenSecrets Google Group yesterday.


This allows me to change my Qlikview load script to include a second table that I have called expensecodes.csv. I make sure that my expense code column label matches the label in my spend data csv file (expends08.csv) and Qlikview does the rest.
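Qlikview links the two tables on the column they have in common. For readers without Qlikview, the sketch below shows the same idea in plain Python. The column label `ExpCode`, the code values and the sample rows are assumptions made up for illustration; only `SectorName` and `DescripLong` come from the OpenSecrets data described in this post.

```python
import csv
import io


def join_on_shared_column(spend_csv, codes_csv):
    """Attach the expense-code columns to each spend row via the one shared column."""
    spend_reader = csv.DictReader(io.StringIO(spend_csv))
    codes_reader = csv.DictReader(io.StringIO(codes_csv))
    shared = [name for name in spend_reader.fieldnames if name in codes_reader.fieldnames]
    if len(shared) != 1:
        raise ValueError(f"expected exactly one shared column, found {shared}")
    key = shared[0]
    lookup = {row[key]: row for row in codes_reader}
    # Spend rows with an unknown code are kept, just without the extra columns.
    return [{**row, **lookup.get(row[key], {})} for row in spend_reader]


# Made-up stand-ins for expends08.csv and expensecodes.csv.
spend_csv = "Vendor,Amount,ExpCode\nAcme Printing,1200,M10\nWebAds Inc,800,M20\n"
codes_csv = "ExpCode,SectorName,DescripLong\nM10,Media,Print Media\nM20,Media,Internet Media\n"

for row in join_on_shared_column(spend_csv, codes_csv):
    print(row["Vendor"], row["SectorName"], row["DescripLong"])
```

If the labels did not match exactly, the function would find no shared column and raise an error, which is the reason for making sure the labels agree before loading.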

Viewing Categorised Data

Now, I want to get a quick view of spend by SectorName and DescripLong (the 2-level categorisation assigned by OpenSecrets).

One way to view this type of data is as a tree map:

[Chart: tree map of spend by SectorName and DescripLong]

Creating the Tree map takes only a couple of minutes in Qlikview. We can see that Administrative cost is the single largest cost category running at about 25% of expenditure with Contributions and Transfers together comprising another 25%.

Drilling into the data

Qlikview’s speed of drilling down into data (filter) is its biggest drawcard. By default, every chart you create is linked to the filters you select on any other chart. For example, if we are interested in seeing our spend v number of suppliers chart above for Internet Media, we need only click on the Internet Media square above and then view the chart.

 

[Chart: spend v number of suppliers, Internet Media]

 

Now let’s look at the same chart for Print Media. We remove the Internet Media filter and click on Print Media.

 

[Chart: spend v number of suppliers, Print Media]

 

Interestingly, you can see that the Print Media supply base is much more fractured than Internet Media. This is probably because, as the candidates start travelling faster from state to state, they start dealing with local Print Media suppliers in each state; whereas they can continue to deal with the same Internet Media providers.

Qlikview’s speed is impressive. We are looking at 2.9 million rows of data on a three-year-old laptop computer. Each drill-down completes in under a second.

Post

US 2008 Election Campaign Spend Data – Part 01

In Vendor Management on April 15, 2009 by hudgeon

OpenSecrets.org cleans and posts unclassified US political data. There’s lots of great data here including 2.9 million rows of third-party supplier spend data from the 2008 US election campaign.

I’ll pass over just how amazing it is that anyone with an interest in this data can access it immediately and access it for free but I encourage you to pause and consider how much the world has changed in the past 15 years.  On to my purposes for accessing the data:

1. Loading the data

The first thing you need to do with a data set is get it into whatever tool you’re using to analyse it.

Qlikview is a great tool I’ve used for loading large data sets. In less than 4 minutes, Qlikview slurps in the 2.9 million rows of data and you’re ready to start analysing.

2. Running reports

The first report I’ve run looks at the spend per month compared to the number of suppliers used per month. The chart below shows that both the spend and the number of suppliers increase dramatically in the 4 months leading up to the election. The convergence of the lines indicates that spend per supplier is increasing significantly as well, which may indicate a fattening of margin in addition to an increase in transaction volumes. Unsurprisingly, this spend decreases quickly in November and December once the TV ads stop airing. Note that the chart below uses two axes, with spend on the left and count of distinct suppliers on the right.
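For anyone wanting to reproduce this report outside Qlikview, the two series are a sum of spend and a count of distinct suppliers, both grouped by month. A minimal Python sketch follows. The row layout (ISO date, supplier name, amount) and the sample rows are assumptions made up for illustration, not the actual OpenSecrets file layout.

```python
from collections import defaultdict


def monthly_spend_and_suppliers(rows):
    """Return {month: (total_spend, distinct_supplier_count)}, ordered by month."""
    spend = defaultdict(float)
    suppliers = defaultdict(set)
    for date, supplier, amount in rows:
        month = date[:7]  # "YYYY-MM" prefix of an ISO date
        spend[month] += amount
        suppliers[month].add(supplier)
    return {month: (spend[month], len(suppliers[month])) for month in sorted(spend)}


# Made-up rows, purely to show the shape of the result.
rows = [
    ("2008-09-03", "Supplier A", 100.0),
    ("2008-09-17", "Supplier A", 50.0),
    ("2008-10-02", "Supplier A", 200.0),
    ("2008-10-09", "Supplier B", 300.0),
]
for month, (total, count) in monthly_spend_and_suppliers(rows).items():
    print(month, total, count, total / count)  # last value is spend per supplier
```

Dividing the first series by the second gives spend per supplier, which is the quantity behind the convergence of the two lines.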

[Chart: spend per month v count of distinct suppliers per month]

Tomorrow, I’ll discuss Qlikview’s real differentiator: drill down capability that replicates the speed of running queries against a spend cube without the pain (and forward planning) involved in building a cube.


Post

Business quote of the week » 20 Years of GameBoy

In Vendor Management on April 6, 2009 by hudgeon

After the big GameBoy success there was an era that did not really feel like Nintendo, namely the GameCube and early N64. It seems that lead developer, senior managing director and legend Shigeru Miyamoto also felt somehow unfamiliar during the “GameCube era”. He said:

“There was an era when Nintendo was going in the direction of doing the same things other companies did. The more we competed with new companies entering the market, the more we started acting similar to them. But is being number one in that competition the same as being number one with the general public? That’s the question we had.”

via » 20 Years of GameBoy Digital Tools – Game Design, Computer Art, Homebrew and Colorful Entertainment..


Post

E-Sourcing: Destroyer or Doyen of Value

In Vendor Management on April 2, 2009 by hudgeon

Tim Cummins from the IACCM has questioned a recent Supply Excellence article on “E-Sourcing Activities & Supplier Relationships: A Match Made in Purchasing Heaven”, asking “E-Sourcing: Does it destroy value?”

Tim’s objection is that buyers use auctions to impose terms on suppliers rather than engage in dialogue.

As a data guy, I’m a big fan of auctions because they provide a controlled environment to collect and disseminate information – sellers see what their competition is willing to do and buyers get an overview of the market.

Nevertheless, I agree with a lot of what Tim says because auctions create information flow in only 2 of the 3 possible directions. Data flows from supplier to supplier and from supplier to buyer but it does not flow from buyer to supplier.

This can create a situation that I will call the Incumbent’s Curse. It’s the other side of the coin from the Winner’s Curse. In the winner’s curse, the winning respondent in a tender finds that, in the heat of the auction, they have paid way too much. In the Incumbent’s Curse, the incumbent supplier cannot win the tender because they know how damn expensive this buyer is to service.

Imagine a supplier who delivers products to offices. In the spec in the tender, the buyer has not mentioned that it takes 30 minutes to get through security and that deliveries can only take place between 7:30am and 8:00am. The incumbent knows this and knows that this adds 1.5% to the cost of servicing the contract. The other respondents do not know this and have a 1.5% “advantage” in the auction. The incumbent loses the contract. The winner unwittingly shaves 1.5% from their margin.

Perhaps we need an auction facility that encourages a three-way dialogue.

Post

Chinese BPO capability, shadow banking and climate change

In Vendor Management on March 30, 2009 by hudgeon

  • 13:12:30: Secret weapon of the Chinese BPO industry “Because their customers are Chinese they [have no] labor cost differential.” http://bit.ly/TIHuP
  • 13:17:26: reportonbusiness.com: Reviving the shadow banking system “I highly recommend we put brakes on it” http://bit.ly/r4n06 Yep.
  • 23:01:18: Dyson on climate change: “inductive science is of limited utility when the object of study is an extremely rare event” http://bit.ly/oYGlD

Tweets copied by twittinesis.com


Post

Startups, global sourcing and supply chain ownership

In Vendor Management on March 29, 2009 by hudgeon

Near sourcing startups

  • 20:15:27: The Pros of Planting Startups in Smaller Cities – BusinessWeek This will be a long term trend, I bet. http://bit.ly/yX5K

The important factors in supplier selection

  • 20:27:54: The…fall of global sourcing “market prices, expected quality and supply chain flexibility [matter]”. Location does not http://bit.ly/un7H

Supply chain ownership

  • 21:12:04: Fabindia Weaves in Artisan Shareholders – BusinessWeek http://bit.ly/DAl7T Joint supply chain ownership – great to see this implemented!

Tweets copied by twittinesis.com

Post

Time management, crowdsourcing and patents

In Vendor Management on March 25, 2009 by hudgeon

Time management

  • 22:27:42: Review:The Age of Speed http://bit.ly/Py7eu Don’t read books on time management. Just stop doing anything that doesn’t lead to your goal.

Patents v. markets

Crowdsourcing

  • 22:42:02: Jeff Howe on Crowdsourcing http://bit.ly/cWsiH The fickleness of the crowd is the biggest hurdle to cross in crowdsourcing.

Entrepreneurship

  • 22:44:29: MIT Sloan Mgmt Review “notes the importance of … testing ideas rather than becoming attached to them from the start.” http://bit.ly/NSsiI
  • 22:51:55: Startup “rather than…trying to overcome all obstacles…think about the entrepreneurial path as…testing hypotheses” http://bit.ly/WhLz8

Tweets copied by twittinesis.com
