<h1>H.264 Is Magic</h1>
<p><a href="https://mjtsai.com/blog/2016/11/05/h-264-is-magic/">November 5, 2016</a></p>
<p><a href="https://sidbala.com/h-264-is-magic/">Sid Bala</a> (via <a href="http://shapeof.com/archives/2016/11/a_couple_of_fun_things_to_read.html">Gus Mueller</a>, <a href="https://news.ycombinator.com/item?id=12871403">Hacker News</a>):</p>
<blockquote cite="https://sidbala.com/h-264-is-magic/"><p>Because now, you can take that frequency domain image and then mask out the edges - discard information which will contain the information with high frequency components. Now if you convert back to your regular x-y coordinates, you'll find that the resulting image looks similar to the original but has lost some of the fine details. But now, the image only occupies a fraction of the space. By controlling how big your mask is, you can now tune precisely how detailed you want your output images to be.</p>
<p>[&#8230;]</p>
<p>The Y is the luminance (essentially black and white brightness) and the Cb and Cr are the chrominance (color) components. RGB and YCbCr are equivalent in terms of information entropy. [&#8230;] But check out the trick: the Y component gets encoded at full resolution. The C components only at a quarter resolution. Since the eye/brain is terrible at detecting color variations, you can get away with this. By doing this, you reduce total bandwidth by one half, with very little visual difference.</p>
<p>[&#8230;]</p>
<p>H.264 splits up the image into macro-blocks - typically 16x16 pixel blocks that it will use for motion estimation. It encodes one static image - typically called an I-frame (Intra frame). This is a full frame - containing all the bits it required to construct that frame. And then subsequent frames are either P-frames (predicted) or B-frames (bi-directionally predicted). P-frames are frames that will encode a motion vector for each of the macro blocks from the previous frame.</p></blockquote>
<p><a href="https://news.ycombinator.com/item?id=12874569">warpzero</a>:</p>
<blockquote cite="https://news.ycombinator.com/item?id=12874569"><p>I was hoping the author would write about H.264 specifically, for instance, how it was basically the &ldquo;dumping ground&rdquo; of all the little tweaks and improvements that were pulled out of MPEG-4 for one reason or another (usually because they were too computationally expensive), and why, as a result, it has thousands of different combinations of features that are extremely complicated to support, which is why it had to be grouped into &ldquo;profiles&rdquo; (e.g., <a href="http://blog.mediacoderhq.com/h264-profiles-and-levels/">Baseline, Main, High</a>).</p>
<p>I was also hoping that he would at least touch on the features that make H.264 unique from previous MPEG standards, like in-loop deblocking, CABAC Entropy Coding, etc.</p></blockquote>
<p>Tags: color, compression, video</p>
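The frequency-masking idea in the first quote can be sketched with a toy one-dimensional DCT. This is purely illustrative (H.264 itself uses small integer transforms, not a floating-point DCT): transform a smooth signal, discard the high-frequency coefficients, invert, and see that the reconstruction stays close to the original.

```python
import math

def dct(v):
    # Orthonormal 1D DCT-II: concentrates a smooth signal's energy
    # in the low-frequency coefficients.
    n = len(v)
    return [
        (math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n))
        * sum(v[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n))
        for u in range(n)
    ]

def idct(c):
    # Inverse transform (DCT-III), the transpose of the orthonormal DCT-II.
    n = len(c)
    return [
        sum(
            (math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n))
            * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
            for u in range(n)
        )
        for x in range(n)
    ]

# A smooth 8-sample "scanline" (made-up values for the demo).
signal = [10, 12, 14, 15, 15, 14, 12, 10]
coeffs = dct(signal)

# "Mask out the edges": keep only the 3 lowest-frequency coefficients.
masked = [c if u < 3 else 0.0 for u, c in enumerate(coeffs)]
approx = idct(masked)

# Despite dropping 5 of 8 coefficients, the reconstruction is close.
error = max(abs(a - b) for a, b in zip(signal, approx))
```

The point of the sketch: the masked coefficient list is mostly zeros, which compresses well, while the reconstruction error stays small for smooth content. Sharp edges would need the discarded high-frequency terms, which is why aggressive masking blurs fine detail.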
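The "reduce total bandwidth by one half" claim about 4:2:0 chroma subsampling is just arithmetic, and can be checked directly. A minimal sketch, assuming 8 bits per sample and illustrative frame dimensions:

```python
def raw_bytes_444(width, height):
    # No subsampling: Y, Cb, and Cr all at full resolution,
    # i.e. 3 samples per pixel.
    return width * height * 3

def raw_bytes_420(width, height):
    # 4:2:0 subsampling: Y at full resolution; Cb and Cr each at
    # half the width and half the height (quarter resolution).
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma

w, h = 1920, 1080
full = raw_bytes_444(w, h)
sub = raw_bytes_420(w, h)
print(full, sub, sub / full)  # the ratio is exactly 0.5
```

Per 2x2 block of pixels this is 4 Y samples plus one Cb and one Cr (6 samples) instead of 12, which is where the factor of two comes from.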
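The P-frame idea in the last quoted paragraph, one motion vector per macroblock, can be sketched as a block-matching search. This is hypothetical toy code, not H.264's actual motion estimation (real encoders use fast search patterns, sub-pixel refinement, and multiple block sizes): for a block in the current frame, exhaustively find the offset into the previous frame that minimizes the sum of absolute differences (SAD).

```python
def sad(cur, prev, bx, by, dx, dy, n=16):
    # Sum of absolute differences between the n x n block at (bx, by)
    # in the current frame and the block at (bx+dx, by+dy) in the
    # previous frame. Frames are lists of rows of luma samples.
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
    return total

def best_motion_vector(cur, prev, bx, by, search=4, n=16):
    # Exhaustive search over a small +/-search window; returns the
    # offset (dx, dy) with the lowest SAD and that cost. The encoder
    # would then store the vector plus a residual instead of the block.
    h, w = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(cur, prev, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if 0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h:
                cost = sad(cur, prev, bx, by, dx, dy, n)
                if cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best, best_cost
```

When the scene content has simply moved, the best offset has a near-zero SAD, and a two-number motion vector replaces 256 pixel values, which is the core of P-frame compression.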