K-Bal wrote: ...and send only one vertex per tile and inflate it with a geometry shader to a full quad for even less transfer overhead.
Sure hope this dude is not targeting a mobile device without GLES 3.2 (or the geometry shader extension on 3.1) then... Geometry shaders don't exist in GLES 2 or in plain GLES 3.0, and most mobile devices are still stuck on GLES2.
K-Bal wrote: As dandymcgee pointed out, don't do this until you need the performance.
I guess I can't really argue against this philosophy, but if your target device is mobile, chances are you are IMMEDIATELY going to need the performance if your tile geometry is not stored GPU-side. Not to mention client-side vertex arrays are a discouraged legacy feature in modern GL ES (and gone entirely from core-profile desktop GL and WebGL)... the dude doesn't have much of a choice for mobile.
K-Bal wrote: Also depending on the situation I would aggregate common data for faster access. So instead of an array of "struct Tile" I would use arrays of vertices, tile ids and so on.
I wouldn't. You can almost never guarantee that something like this is going to be an optimization on GPU architectures. When a tile's data is scattered all over the place in disjoint arrays, every attribute of every vertex lives in a different cache line, and you end up thrashing the SHIT out of the cache. A GPU is built to make the most of every trip to global memory, and organizing the data as an array-of-structs rather than a structure-of-arrays means the first access drags the rest of that vertex's data in with it, so nothing has to go back to global memory for it... I can almost guarantee you this will actually be a performance hit unless you have a SHITLOAD of cache lines or a small data set.
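To put rough numbers on that, here's a minimal C++ sketch. The `TileVertex` struct, the constant names, and the assumption that the separate arrays sit back to back in one buffer are all mine, not anything from the thread. It just computes how far apart a vertex's position and UV end up in each layout.

```cpp
#include <cstddef>

// Hypothetical interleaved (array-of-structs) vertex: position and UV side by side.
struct TileVertex {
    float x, y;  // position
    float u, v;  // texture coordinate
};

// Stride and attribute offsets for a buffer of TileVertex elements, i.e. the
// values one would pass to glVertexAttribPointer for each attribute.
constexpr std::size_t kStride    = sizeof(TileVertex);
constexpr std::size_t kPosOffset = offsetof(TileVertex, x);
constexpr std::size_t kUvOffset  = offsetof(TileVertex, u);

// Bytes between a vertex's position and its UV when interleaved: a constant.
constexpr std::size_t interleavedGap() {
    return kUvOffset - kPosOffset;
}

// Same distance when all positions are stored first and all UVs after them
// (assumed back-to-back layout): the whole position array sits in between.
constexpr std::size_t separateGap(std::size_t vertexCount) {
    return vertexCount * 2 * sizeof(float);
}
```

With 4-byte floats, `interleavedGap()` is 8 bytes no matter how big the map is, while `separateGap(1024)` is already 8192 bytes, so the two attributes of one vertex can't share a cache line.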
Also, splitting the data up like that adds a layer of indirection if you need data from one array to index into another array to do anything useful in your shader. You're going to get better performance if it's preprocessed into a single, contiguous, interleaved set. Sure, he's going to have duplicate UV coordinates for vertices with the same tile IDs, and in our case the VBO is WAAAY bigger, but it's going to be way faster thanks to its cache locality.
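The preprocessing step could look something like the C++ sketch below. Everything specific in it is an assumption for illustration: the `buildTileVertices` name, 1x1 tiles, a grid-shaped atlas where tile id N lives at column `N % atlasCols` and row `N / atlasCols`, and six vertices (two triangles) per tile with no index buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interleaved vertex: position and UV stored together.
struct TileVertex { float x, y, u, v; };

// Bakes a tile-id grid into one contiguous, interleaved vertex array.
// UVs are resolved here on the CPU, so the shader never has to look up a
// tile id in a second array.
std::vector<TileVertex> buildTileVertices(const std::vector<std::uint16_t>& tileIds,
                                          std::size_t mapWidth,
                                          int atlasCols, int atlasRows) {
    std::vector<TileVertex> out;
    out.reserve(tileIds.size() * 6);  // two triangles per tile
    const float du = 1.0f / atlasCols;
    const float dv = 1.0f / atlasRows;
    for (std::size_t i = 0; i < tileIds.size(); ++i) {
        const float x0 = static_cast<float>(i % mapWidth);
        const float y0 = static_cast<float>(i / mapWidth);
        const float u0 = (tileIds[i] % atlasCols) * du;
        const float v0 = (tileIds[i] / atlasCols) * dv;
        const TileVertex tl{x0,        y0,        u0,      v0};
        const TileVertex tr{x0 + 1.0f, y0,        u0 + du, v0};
        const TileVertex bl{x0,        y0 + 1.0f, u0,      v0 + dv};
        const TileVertex br{x0 + 1.0f, y0 + 1.0f, u0 + du, v0 + dv};
        // Duplicated corners and UVs are the price of the flat layout.
        out.insert(out.end(), {tl, bl, tr, tr, bl, br});
    }
    return out;
}
```

The result would be uploaded once, e.g. with `glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(TileVertex), verts.data(), GL_STATIC_DRAW)`, and drawn straight from the VBO after that.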