{"id":8722,"date":"2014-04-18T16:50:38","date_gmt":"2014-04-18T20:50:38","guid":{"rendered":"http:\/\/mjtsai.com\/blog\/?p=8722"},"modified":"2014-04-18T16:50:38","modified_gmt":"2014-04-18T20:50:38","slug":"stack-exchange-technical-debt","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2014\/04\/18\/stack-exchange-technical-debt\/","title":{"rendered":"Stack Exchange Technical Debt"},"content":{"rendered":"<p><a href=\"http:\/\/marcgravell.blogspot.com\/2014\/04\/technical-debt-case-study-tags.html\">Marc Gravell<\/a>:<\/p>\n<blockquote cite=\"http:\/\/marcgravell.blogspot.com\/2014\/04\/technical-debt-case-study-tags.html\"><p>So, we&rsquo;ve got inline tag data that is simple to display, but is virtually impossible to query. Regular indexing doesn&rsquo;t really work well at finding matches in the middle of character data. Enter (trumpets) <a href=\"http:\/\/technet.microsoft.com\/en-us\/library\/ms142571.aspx\">SQL Server Full-Text Search<\/a>. This is inbuilt to SQL Server (which we were already using), and allows all kinds of complex matching to be done using <a href=\"http:\/\/technet.microsoft.com\/en-us\/library\/ms142583.aspx\">CONTAINS, FREETEXT, CONTAINSTABLE and FREETEXTTABLE<\/a>. But there were some problems: <a href=\"http:\/\/en.wikipedia.org\/wiki\/Stop_words\">stop words<\/a> and non-word characters (think &ldquo;c#&rdquo;, &ldquo;c++&rdquo;, etc). For the tags that weren&rsquo;t stop words and didn&rsquo;t involve symbols, it worked great. So how to convince Full Text Search to work with these others? Answer: cheat, lie and fake it. At the time, we only allowed ASCII alpha-numerics and a few reserved special characters (+, &#8211;, ., #), so it was possible to hijack some non-English characters to replace these 4, and the problem of stop words could be solved by wrapping each tag in a pair of other characters. It looked like gibberish, but we were asking Full Text Search for exact matches only, so frankly it didn&rsquo;t matter. A set of tags like &ldquo;.net c#&rdquo; thus became &ldquo;&eacute;&ucirc;net&agrave; &eacute;c&ntilde;&agrave;&rdquo;.<\/p>\n<p>[&#8230;]<\/p>\n<p>We finally had a reason to remove this legacy from the past. [&#8230;] After some thought and consideration, we settled on a pipe (bar) delimited natural representation, with leading\/trailing pipes, so &ldquo;.net c#&rdquo; becomes simply &ldquo;|.net|c#|&rdquo;.<\/p><\/blockquote>","protected":false},"excerpt":{"rendered":"<p>Marc Gravell: So, we&rsquo;ve got inline tag data that is simple to display, but is virtually impossible to query. Regular indexing doesn&rsquo;t really work well at finding matches in the middle of character data. Enter (trumpets) SQL Server Full-Text Search. This is inbuilt to SQL Server (which we were already using), and allows all kinds [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"","apple_news_api_id":"","apple_news_api_modified_at":"","apple_news_api_revision":"","apple_news_api_share_url":"","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[2],"tags":[143,71,343],"class_list":["post-8722","post","type-post","status-publish","format-standard","hentry","category-technology","tag-database","tag-programming","tag-search"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/8722","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=8722"}],"version-history":[{"count":0,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/8722\/revisions"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=8722"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=8722"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=8722"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}