April 26, 2011

How to find the search keywords that lead users to your website using Woopra

As I have mentioned before (here and here), Woopra is one of the best analytics engines I’ve seen. If you like watching real-time traffic to your site, I think you’d love it. Installing Woopra on your site is not very hard; use the guide here. I use it on my blog and find very interesting facts about what people search for on various search engines before landing on my blog.

One fine weekend, I wanted to fetch all the search keywords people use to reach my blog and save them to a file, one file per day. I knew that Woopra provides REST API access, and I wanted to write something quick to save this data. Instead of C#, I chose PowerShell, and I just wanted to share the small script I came up with. (Pardon my dirty code; it’s just something I wrote very quickly and haven’t had a chance to clean up.)

    # System.Web provides HttpUtility.UrlDecode
    Add-Type -AssemblyName System.Web

    $a = New-Object XML
    $today = Get-Date
    # Directory that contains this script; the output files are written here
    $currentDir = Split-Path -Parent $MyInvocation.MyCommand.Definition
    $currentDir

    # Fetch today and the three previous days, one request and one file per day
    for ($i = 0; $i -le 3; $i++)
    {
        $td = $today.AddDays($i * -1)
        # Day/month/year joined with "/", for example 22/4/2011
        $dayString = $td.Day, $td.Month, $td.Year -join "/"
        $strURL = "http://api.woopra.com/rest/analytics/getqueries.jsp?website=jigar-mehta.blogspot.com&api_key=XXXXXXXXXX&date_format=dd/MM/yyyy&start_day=" + $dayString + "&end_day=" + $dayString + "&limit=100&offset=0"
        $a.Load($strURL)
        # Output file named after the day, for example 22042011.txt
        $strDate = Join-Path $currentDir ([string]::Format("{0:ddMMyyyy}{1}", $td, ".txt"))
        Write-Host "Fetching for" $td.ToLongDateString() "into" $strDate
        # Header line; this call also creates or overwrites the file
        "============================================================================ " + $td.ToLongDateString() | Out-File $strDate -Encoding Unicode
        foreach ($iii in $a.response.items.item)
        {
            # Queries arrive URL-encoded: turn "+" into spaces, then decode the rest
            $strTemp = $iii.name -replace "\+", " "
            [Web.HttpUtility]::UrlDecode($strTemp) | Out-File $strDate -Append -Encoding Unicode
        }
    }
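
To see what the inner loop does without calling Woopra, the same steps can be run against an inline XML string. The sample document below is made up for illustration: it only mirrors the response/items/item path and the name field that the script reads, and the real response may be laid out differently or carry more fields. UrlDecode already turns "+" into a space, so this sketch skips the separate -replace step.

```powershell
Add-Type -AssemblyName System.Web

# Made-up sample that mirrors the response/items/item path used by the script.
# Assumption: the real Woopra response may differ in layout and fields.
$sample = [xml]@'
<response>
  <items>
    <item name="dell+inspiron+1525+drivers" />
    <item name="extract+.vhd+files+.iso" />
    <item name="%E0%AA%97%E0%AB%81%E0%AA%9C%E0%AA%B0%E0%AA%BE%E0%AA%A4%E0%AB%80+natak" />
  </items>
</response>
'@

# Decode each URL-encoded query into readable text
$keywords = foreach ($item in $sample.response.items.item)
{
    [Web.HttpUtility]::UrlDecode($item.name)
}
$keywords
```

The third item is the percent-encoded UTF-8 form of a Gujarati word followed by "natak", which is why the script writes its files with Unicode encoding.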

To use the script above, replace website=jigar-mehta.blogspot.com with your website name and api_key=XXXXXXXXXX with your own account’s API key. (You can get it at https://www.woopra.com/members/settings/api.jsp?website=<YourWebsiteName>, assuming you are logged in.)
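
If you would rather not edit the long URL string by hand, the two values can be passed as parameters. This is only a sketch: New-WoopraQueryUrl is a hypothetical helper name, not something Woopra provides, and example.blogspot.com is a placeholder. The query string itself is copied from the script.

```powershell
# Hypothetical helper (not part of Woopra or the original script): builds the
# getqueries URL from parameters instead of a hard-coded string.
function New-WoopraQueryUrl
{
    param(
        [Parameter(Mandatory = $true)][string]   $Website,
        [Parameter(Mandatory = $true)][string]   $ApiKey,
        [Parameter(Mandatory = $true)][datetime] $Day
    )

    # Same day/month/year form the script sends for start_day and end_day
    $dayString = $Day.Day, $Day.Month, $Day.Year -join "/"

    "http://api.woopra.com/rest/analytics/getqueries.jsp" +
        "?website=" + $Website +
        "&api_key=" + $ApiKey +
        "&date_format=dd/MM/yyyy" +
        "&start_day=" + $dayString +
        "&end_day=" + $dayString +
        "&limit=100&offset=0"
}

# Placeholder website and key; substitute your own values
$url = New-WoopraQueryUrl -Website "example.blogspot.com" -ApiKey "XXXXXXXXXX" -Day (Get-Date -Year 2011 -Month 4 -Day 22)
$url
```

The returned string can be passed straight to $a.Load() in place of $strURL.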



It will create text files in the same directory as the PowerShell script, one per day, containing the keywords that led users to your blog.
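
Each file is named after the day it covers, using the ddMMyyyy pattern from the script's [string]::Format call. A quick way to check which file belongs to which date (the date here is just an example):

```powershell
# Example date only; any [datetime] value is formatted the same way
$day = Get-Date -Year 2011 -Month 4 -Day 22

# Same format pattern the script uses for its output file names
$fileName = [string]::Format("{0:ddMMyyyy}.txt", $day)
$fileName   # 22042011.txt
```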


Sample output for a single day of queries to my blog looks like this:



    ============================================================================ Friday, April 22, 2011
    dell inspiron 1525 drivers
    windows phone 7 icon
    dell inspiron 1525 wireless network driver
    download dell inspiron 1525 intel chipset utility for windows xp
    1525 driver xp
    youtube gujarati natak comedy
    dell inspiron 1525 bluetooth driver download
    inspiron 1525 video driver
    windows phone style icons
    youtube gujarati natak
    how to install hyper-v manager on windows 7
    device driver for inspiron 1525
    jigar mehta
    free online bug tracking system
    ગુજરાતી નાટક
    youtube gujarati movie
    ગુજરાતી natak
    dell inspiron 1525 audio drivers
    gujarati drama bas kar bakula
    gujarati natak
    dell insprion 1525 driver download xp
    dell inspiron xp drivers
    functional testing visual studio 2010
    jigar container movers
    gujarati natak list
    dell inspiron 1525 network intel driver
    dell inspiron 1525 video drivers for windows xp
    download application bar icon for windows phone 7
    hosted isuue tracker free
    nayan ne bandh rakhine jyare tamne joya chhe lyrics
    dell inspiron 1525 audio driver
    sairam dave jokes from prem etle vahem download link
    extract vhd
    dell inspiron 1525 download drivers xp
    extract .vhd files .iso
    inspiron 1525 xp drivers
    online free jira
    sairam dave prem jokes.zip
    gujarati natak on youtube list
    videosearch.rediff.comvideo_play.php?id
    vhd extract
    side by side execution assembly
    youtube .gujarati.movei.new
    cr-48

PowerShell rocks!