{"id":7650,"date":"2013-07-30T16:16:20","date_gmt":"2013-07-30T21:16:20","guid":{"rendered":"http:\/\/mjtsai.com\/blog\/?p=7650"},"modified":"2013-07-30T16:16:20","modified_gmt":"2013-07-30T21:16:20","slug":"parsing-my-apache-logs","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2013\/07\/30\/parsing-my-apache-logs\/","title":{"rendered":"Parsing My Apache Logs"},"content":{"rendered":"<p><a href=\"http:\/\/www.leancrew.com\/all-this\/2013\/07\/parsing-my-apache-logs\/\">Dr. Drang<\/a>:<\/p>\n<blockquote cite=\"http:\/\/www.leancrew.com\/all-this\/2013\/07\/parsing-my-apache-logs\/\"><p>I stopped using Google Analytics a couple of months ago. [&#8230;] But I am still curious about which pages are being read and where the readers are coming from, so a wrote a little script to parse my site&rsquo;s Apache log file and return the top five pages and referrers for a given day. Along the way, I learned more about Python&rsquo;s <a href=\"http:\/\/docs.python.org\/2\/library\/collections.html\"><code>collections<\/code> library<\/a> and the <a href=\"http:\/\/docs.python.org\/2\/library\/re.html#match-objects\"><code>groupdict<\/code> method<\/a> for regular expression matches.<\/p><\/blockquote>\n<p>I&rsquo;ve been thinking of doing something similar. <a href=\"http:\/\/www.google.com\/analytics\/\">Google Analytics<\/a> is annoying to check because it requires logging in. My different sites are under different accounts, and it takes a lot of clicking around to get to the information that I want. <a href=\"http:\/\/haveamint.com\">Mint<\/a> doesn&rsquo;t filter referrers reliably and seems to increase the load on my server. By writing my own script that accesses the logs directly, I&rsquo;ll be able to track non-JavaScript requests (e.g. .dmg downloads) and also calculate some custom analytics that wouldn&rsquo;t be possible with off-the-shelf software.<\/p>","protected":false},"excerpt":{"rendered":"<p>Dr. Drang: I stopped using Google Analytics a couple of months ago. [&#8230;] But I am still curious about which pages are being read and where the readers are coming from, so a wrote a little script to parse my site&rsquo;s Apache log file and return the top five pages and referrers for a given [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"","apple_news_api_id":"","apple_news_api_modified_at":"","apple_news_api_revision":"","apple_news_api_share_url":"","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[518,519,74,232,96],"class_list":["post-7650","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-googleanalytics","tag-mint","tag-opensource","tag-python","tag-web"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/7650","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=7650"}],"version-history":[{"count":0,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/7650\/revisions"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=7650"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=7650"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=7650"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}