{"id":179,"date":"2012-01-09T03:28:00","date_gmt":"2012-01-09T09:28:00","guid":{"rendered":"http:\/\/elysianshadows.com\/2012\/01\/optimizations-and-drastic-performance-improvements\/"},"modified":"2012-01-09T03:28:00","modified_gmt":"2012-01-09T09:28:00","slug":"optimizations-and-drastic-performance-improvements","status":"publish","type":"post","link":"http:\/\/elysianshadows.com\/updates\/optimizations-and-drastic-performance-improvements\/","title":{"rendered":"Optimizations and Drastic Performance Improvements"},"content":{"rendered":"\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">On Friday, I decided that I was completely fucking sick of the loadtimes and lack of responsiveness on the Windows build. I spent literally the entire day doing absolutely nothing but performance profiling and optimizations&#8230; the results are what you see in the repositories. The loadtimes are 100% eliminated, BUT there are still a few small issues with regards to moving the selection around the scene on certain platforms. My changes required some SERIOUS internal rewriting, and we are still performing testing and resolving a few small issues. We are aware of them. 
\ud83d\ude00<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[b]Tested Platforms[\/b]:<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">1) Falco&#8217;s work i7 (Windows) &#8211; works flawlessly<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">2) All of our Macbook pros &#8211; works flawlessly<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">3) Jarrod&#8217;s PC (Windows) &#8211; works flawlessly<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">4) Tyler&#8217;s laptop (Windows) &#8211; serious redraw issues with tile selection<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">5) Our server (Kubuntu) &#8211; serious redraw issues with tile selection<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">* &#8211; even the two with redraw issues had absolutely no loadtimes and a drastic improvement in performance.<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[size=20]So what the fuck did you do?[\/size]<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">Wheeeeeeell&#8230; gather round, gentlemen. 
After a bit of profiling, it turned out that the unforgivable load times on Windows were due to two contributing issues&#8230;<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[b]1) Tile Cutting Algorithm[\/b]<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">I had a pretty far-fetched theory with this one, and it turns out that I was right.&nbsp;<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[u]Fig 1. Reested Code[\/u]<\/p>\n<pre>QImageReader reader;\nreader.setFileName(loc);\nQRect clipRect(0,0,spriteWidth,spriteHeight);\nint id = 0;\nif(!reader.canRead()) {\n\tDebug::Logf(Debug::CRITICAL, \"Could not read image file %s!\", loc.toStdString().c_str());\n\treturn false;\n}\nfor(; clipRect.y() &lt; (int)sheetHeight; clipRect.translate(0,spriteHeight)) {\n\tfor(clipRect.moveTo(0,clipRect.y()); clipRect.x() &lt; (int)sheetWidth; clipRect.translate(spriteWidth,0)) {\n\t\t\/\/ Re-point the reader at the file for every tile: this per-tile reopen is the bottleneck\n\t\treader.setFileName(loc);\n\t\treader.setClipRect(clipRect);\n\t\tQImage image = reader.read();\n\t\tif(image.isNull()) {\n\t\t\tQString error = reader.errorString();\n\t\t\tDebug::Log(Debug::CRITICAL, \"TileManager::DivideTileSheet(): There was an error reading the image! Error message: \" + error);\n\t\t\treturn false;\n\t\t}\n\t\tVisualTile *tile=new VisualTile;\n\t\ttile-&gt;image=QPixmap::fromImage(image);\n\t\ttile-&gt;id=id++;\n\t\tcontainer.push_back(tile);\n\t}\n}\nreturn true;\n<\/pre>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">It turns out that our original QImageReader implementation of the algorithm worked like this:<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">1) fetch image from hard drive<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">2) seek to location<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">3) read in small portion of image<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">4) close image<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">(REPEAT FOR EVERY TILE IN THE SHEET)<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">The shit performance was due to the opening and closing of the sheet from the drive for each tile loaded&#8230;<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[u]Fig 2. 
Pristine Code[\/u]<\/p>\n<pre>\/\/ Read the whole sheet from disk into RAM once\nQImage image(loc);\nQRect clipRect(0,0,spriteWidth,spriteHeight);\nint id = 0;\nif(image.isNull()) {\n\tDebug::Logf(Debug::CRITICAL, \"Could not read image file %s!\", loc.toStdString().c_str());\n\treturn false;\n}\nfor(; clipRect.y() &lt; (int)sheetHeight; clipRect.translate(0,spriteHeight)) {\n\tfor(clipRect.moveTo(0,clipRect.y()); clipRect.x() &lt; (int)sheetWidth; clipRect.translate(spriteWidth,0)) {\n\t\tVisualTile *tile=new VisualTile;\n\t\t\/\/ Copy the tile-sized sub-rectangle out of the in-memory image\n\t\ttile-&gt;image = QPixmap::fromImage(image.copy(clipRect));\n\t\ttile-&gt;id=id++;\n\t\tif(tile-&gt;image.isNull()) {\n\t\t\tDebug::Log(Debug::CRITICAL, \"TileManager::DivideTileSheet(): There was an error reading the image!\");\n\t\t\tdelete tile;\n\t\t\treturn false;\n\t\t}\n\t\tcontainer.push_back(tile);\n\t}\n}\nreturn true;\n<\/pre>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">My new algorithm reads the entire picture into RAM, then copies subsections of the image. There is only one read from the hard drive.<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">The effect on loadtimes was fucking insane. [i]We went from 5.2+ seconds to about 50ms.[\/i] Yes, bitches. That&#8217;s roughly 100x faster.<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[b]2) QGraphicsScene Population[\/b]<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">This has been a known issue for a while&#8230; I just haven&#8217;t been able to devise a workable solution. The problem is that for every tile, we used a QGraphicsItem to render its image&#8230; That&#8217;s exactly how Qt&#8217;s Graphics View framework is meant to be used. No big deal, right? Well, we have a 200&#215;200 map at 4 layers&#8230; 200x200x4 = 160,000 QGraphicsItems. 
Amazingly enough, this seemed fine on Linux and OS X&#8230; but for some reason, Qt&#8217;s Windows implementation just couldn&#8217;t handle the sheer number. That was out of our control.<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">I finally had an epiphany on Friday&#8230; Rather than having an entire set of QGraphicsItems to represent each layer, I could use a single grid of QGraphicsItems (200&#215;200 = 40,000 items instead of 160,000) whose paint() functions were overridden to [i]render all four layers[\/i]. Boom! Windows can populate a scene with 60k items easily. All of the scene population happens the instant you load the Toolkit. It will never need to do that again.<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[b]IN ADDITION TO THE ABOVE REEST[\/b]<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">I have discovered a way to hardware accelerate Qt&#8217;s QGraphicsView by using OpenGL. On various platforms, this results in a huuuuuuuuuge performance increase. On some (with shitty OpenGL drivers), this is actually slower. Jarrod is going to be implementing a Toolbar UI at the top of the Toolkit with handy icons for common actions (flip\/rotate tiles, reload sheets, invoke engine, etc.). A &#8220;toggle hardware acceleration&#8221; button will certainly be up there as well. You guys will love it.<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">So there you have it. Between the two optimizations, there are no longer any loadtimes in the Toolkit. My top priority is to resolve the redraw issues. Once this is done, I will feel very comfortable encouraging everybody to start using the &#8220;optimized-as-shit&#8221; build. 
\ud83d\ude00<\/p>\n","protected":false},"excerpt":{"rendered":"<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\"><strong>Originally posted by Gyrovorbis (Falco Girgis) in our Private Development forum on 1.9.12.<\/strong><\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">[quote=&#8221;pritam&#8221;]Edit: I re-cloned the toolkit repo and with exception to lack of viewport repaint on xp, the absence of the freeze feature is glorious![\/quote]WHEEEEEEEEEEEEEEEEELL!!!!<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">You weren&#8217;t supposed to know about that new feature yet. \ud83d\ude09<\/p>\n<p style=\"color: #333333; font-family: Tahoma, Helvetica, Arial, sans-serif; font-size: 12px; line-height: 15px;\">Well, since people are pulling down Jarrod and my &#8220;extremely optimized beta-as-shit edition,&#8221; I had better start talking about it. 
&nbsp;\ud83d\ude06&nbsp;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-179","post","type-post","status-publish","format-standard","hentry","category-underlying-technology"],"_links":{"self":[{"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/posts\/179","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/comments?post=179"}],"version-history":[{"count":0,"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/posts\/179\/revisions"}],"wp:attachment":[{"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/media?parent=179"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/categories?post=179"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/elysianshadows.com\/updates\/wp-json\/wp\/v2\/tags?post=179"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}