User:Colin Marquardt/OSM Tech

From OpenStreetMap Wiki

Discussion in #osm in early December 2009:

22:34:11    jburgess * emarsden: I know some of the basics about quadtiles, like that they are used as a simple 2d index on the lat/lon data
22:34:18    _dodger_ * emarsden: Do you have any specific questions? I could point you to the code for the quadtree.
22:38:01    _dodger_ * emarsden: Did you get our answers before disconnecting?
22:40:28    emarsden * _dodger_: I did not
22:41:02    _dodger_ * <jburgess> emarsden: I know some of the basics about quadtiles, like that they are used as a simple 2d index on the lat/lon data
22:41:07    _dodger_ * <_dodger_> emarsden: Do you have any specific questions? I could point you to the code for the quadtree.
22:42:38    emarsden * OK, thanks. I take it that the quadtree is used to find the ways/nodes/relations in response to a map call?
22:43:07       zereb * emarsden: only the nodes. the ways and relations are found from linkage on the way_nodes and relation_members tables
22:44:08    jburgess * emarsden: they are only used for nodes & gps_points
22:45:25    emarsden * OK, so the geom VARCHAR field contains the quadtree information, which is a string of A/B/C/D equivalent characters?
22:45:46    jburgess * no, I think it is just a 32bit int
22:46:12       zereb * emarsden: the geom field is used in the mapnik rendering database. the node just has an integer field called tile.
22:46:51    emarsden * where can I find the schema?
22:47:12    _dodger_ *
22:47:34    jburgess * see also
22:47:56       zereb * or even
22:47:58    jburgess * though neither are guaranteed to be completely accurate (the definitive source is the ruby migrations)
22:48:43    jburgess * I believe 'tile' is the one which contains the quadtile info
22:49:46    jburgess * to generate the number, the world is first split into 65k x 65k xy grid. Then the bits of these two 16 bit numbers is interleaved to generate the 32 bit tile number
22:50:29    _dodger_ * Also called the Morton number
22:50:32       zereb * indeed. it contains a 32bit morton code created in this
22:50:43   balrog-kun was just going to mention Morton numbers
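The interleaving jburgess describes (scale lat/lon onto a 65536 × 65536 grid, then weave the bits of the two 16-bit coordinates into one 32-bit Morton number) can be sketched in Ruby. This is an illustrative reconstruction from the conversation, not the actual rails_port code; the function names are made up:

```ruby
# Sketch of the quadtile scheme described above (illustrative).
# Scale lon/lat onto a 65536 x 65536 grid...
def lon2x(lon)
  ((lon + 180.0) * 65_535 / 360.0).round
end

def lat2y(lat)
  ((lat + 90.0) * 65_535 / 180.0).round
end

# ...then interleave the bits of x and y, most significant first,
# into a single 32-bit Morton number.
def xy2tile(x, y)
  tile = 0
  15.downto(0) do |i|
    tile = (tile << 1) | ((x >> i) & 1) # x bit
    tile = (tile << 1) | ((y >> i) & 1) # y bit
  end
  tile
end

def tile_for_point(lat, lon)
  xy2tile(lon2x(lon), lat2y(lat))
end
```

Nearby points end up with numerically close tile values, which is what makes a plain integer index useful as a 2D index.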
22:51:04    _dodger_ * But all that has very little to do with PostgreSQL - and that's what you were initially asking about.
22:51:17    jburgess * then when we need to lookup the data in a given bounding box we calculate the (small) number of tile numbers covered and fetch the data with matching tile numbers
22:51:54       zereb * well, it turns out that a 32bit tile code lookup using a btree index is about twice as fast as using gist/rtree index on the points
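The bounding-box lookup jburgess and zereb describe can likewise be sketched: enumerate the (small number of) grid cells the box covers, convert each to its tile number, and fetch rows with a plain `tile IN (...)` condition that an ordinary btree index can serve. Again an illustrative sketch, with helper names invented here; the encoding follows the 65536 × 65536 grid description earlier in the conversation:

```ruby
# Illustrative sketch of the bounding-box lookup described above.
def lon2x(lon)
  ((lon + 180.0) * 65_535 / 360.0).round
end

def lat2y(lat)
  ((lat + 90.0) * 65_535 / 180.0).round
end

def xy2tile(x, y)
  tile = 0
  15.downto(0) do |i|
    tile = (tile << 1) | ((x >> i) & 1)
    tile = (tile << 1) | ((y >> i) & 1)
  end
  tile
end

# Every quadtile number covered by a lat/lon bounding box.
def tiles_for_area(minlat, minlon, maxlat, maxlon)
  tiles = []
  (lon2x(minlon)..lon2x(maxlon)).each do |x|
    (lat2y(minlat)..lat2y(maxlat)).each do |y|
      tiles << xy2tile(x, y)
    end
  end
  tiles
end

# A query could then look something like:
#   SELECT * FROM current_nodes WHERE tile IN (#{tiles.join(',')})
```

A typical map-call box spans only a handful of grid cells, so the IN-list stays short and the whole lookup is a cheap btree probe per tile number.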
22:52:09    emarsden * great information, thanks
22:52:52    emarsden * I was going to ask what the gbtree* functions in apidb.sql correspond to
22:53:30    _dodger_ * zereb: That's quite fast! I wasn't aware of that. I always thought that the R-Tree performed quite well. On the other hand the morton/quadtile numbers do not need to be recalculated/redistributed.
22:55:55       zereb * _dodger_: exactly. and, for a fairly even distribution, they're extremely efficient
22:56:35       zereb * in fact, i think the reported numbers were for queries only, no updates going on. i think it has more to do with the amount of optimisation that postgres' btree implementation has had
22:57:27       zereb * emarsden: the gbtree* functions are used to implement the btree/gist index which is used on the changeset bounding boxes. i don't think they're used for the tile code lookup - that's just a vanilla btree
23:02:23    emarsden * and the database is sometimes returning > 10M tuples per second. Impressive!
23:03:41       zereb * it's averaging 2.73M over the past week, spiking to 195M 
23:04:05       zereb * it's fast :-)
23:04:12    emarsden * I'm looking at the stats to try to determine the balance between reads & writes, but it isn't clear to me
23:05:51       zereb * looks like it's fetching about 3,500 times as many tuples as it's inserting, updating or deleting
23:05:57    emarsden * this is the ratio between "tuples fetched" + "tuples returned" and "tuples inserted/updated/deleted"?
23:06:32       zereb * i guess so.
23:08:00       zereb * the iostat says the disk array is averaging about 30k blocks/s read and about 3.4k blocks/s write
23:08:40    emarsden * so about 99% of reads are served from RAM?
23:09:18    emarsden * that's strange
23:10:02       zereb * it may not be that the tuples come from ram, there could just be more contiguous reads than writes
23:10:20    emarsden * ok
23:11:30       zereb * it does keep about 28Gb of cached data, so i wouldn't be surprised if most reads are served from ram
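The "mostly served from RAM" inference can be sanity-checked from the figures quoted above. Note the tuple size and the iostat block size below are assumptions made for this rough check, not numbers from the conversation:

```ruby
# Rough sanity check of the cache-hit inference, from the figures above.
tuples_per_sec = 2_730_000 # average tuples returned (quoted above)
tuple_bytes    = 100       # ASSUMED average tuple size
read_blocks    = 30_000    # iostat blocks/s read (quoted above)
block_bytes    = 512       # ASSUMED iostat block size

demand = tuples_per_sec * tuple_bytes # ~273 MB/s of tuple data wanted
disk   = read_blocks * block_bytes    # ~15 MB/s actually read from disk

# Fraction of reads that must be coming from cache rather than disk:
from_ram = 1.0 - disk.to_f / demand   # roughly 0.94 under these assumptions
```

Under these assumptions the disk supplies only a few percent of the data being returned, which is consistent with the 28 GB of cached data mentioned above.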
23:12:46    emarsden * where the entire database on disk is around 100Gb?
23:13:28       zereb * about 1.3Tb on disk
23:14:15       zereb * 1.23Tb, actually
23:14:56    emarsden * is that mostly history tables, or user diary spam, or?
23:16:07  elasticdog * does anyone know how much storage space a complete set of rendered Mapnik tiles will take up? (zoom 0-19)
23:16:19    _dodger_ * I'd guess that a big part of that comes from the GPS track points and the various indexes. But zereb will surely have a more detailed answer :)
23:18:11       zereb * elasticdog: over 4 exabytes, by a back-of-the-envelope calculation
23:18:25  elasticdog * I don't think that's right :-)
23:18:49    _dodger_ * elasticdog: That question was recently answered on one of the mailing lists (talk or dev) and someone said 240 GB but I have no idea how accurate that is.
23:19:12     bobkare * no way it's just 240GB down to z19
23:19:13       zereb * well, there are 4^n tiles at zoomlevel n, so sum(4^n, n=0..19) * 14kb (average tile size), which is 4.6 exabytes
23:19:21    jburgess * _dodger_: higher than that, the main tile server currently has over 500GB of tiles
23:19:52     dhansen * zereb: is an empty .png 14k? 70% of the world's tiles will be solid blue :)
23:20:09    jburgess * I think the current average tiles size is closer to 1kB, so maybe in the 400TB region
23:20:19       zereb * dhansen: that's true - let's assume they're not stored and knock off 70% of that
23:20:20    _dodger_ * jburgess: Okay. I really have no idea - was just citing the mailing list. So elasticdog, ignore what I was saying, trust jburgess
23:20:25       zereb * so about 1 exabyte?
23:20:40  elasticdog * hehe...had this conversation with a co-worker yesterday, and we had ranges from very small to roughly 3 petabytes
23:20:48    jburgess * plus the compute power to render them all
23:21:21  elasticdog * I'm in the middle of a full render, but only up to zoom 12
23:21:43       zereb * jburgess: based on a 1kb average and dropping 70% water tiles, that's still 100 petabytes
23:21:50       apmon * elasticdog: Good luck with the exabyte ;-)
23:21:55       zereb * or have i dropped a power of 1024 somewhere?
23:22:28       zereb * yeah, i dropped one - 100 Tb
23:22:45  elasticdog * how are you calculating the full number of tiles? 4 ^ 18 ?
23:23:14       zereb * no, in ruby: (0..19).inject(0) {|s,x| s + 4**x }
23:23:30    jburgess * I make 2^19 * 2^19 * 1kB = 274TB
23:24:06       zereb * i then multiplied by 0.3 on the assumption that you wouldn't bother rendering or storing water-only tiles
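The back-of-the-envelope figures in this exchange can be reproduced directly (the 1 kB average is jburgess's estimate, the 70% water share is dhansen's; both are rough):

```ruby
# Reproducing the storage estimates from the discussion above.
all_tiles = (0..19).sum { |z| 4**z } # 4^n tiles at zoom n, z0..z19
land      = (all_tiles * 0.3).round  # assume ~70% are water-only and skipped
avg       = 1_000                    # ~1 kB average tile size

total_tb = all_tiles * avg / 1e12    # ~366 TB rendering every tile
land_tb  = land * avg / 1e12         # ~110 TB without the water tiles
z19_tb   = (2**19)**2 * avg / 1e12   # ~275 TB counting zoom 19 alone
```

This matches zereb's corrected "100 Tb" once the dropped power of 1024 is fixed, and jburgess's 2^19 × 2^19 × 1kB ≈ 274 TB figure for the bottom zoom level, which dominates the total.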
23:24:21  elasticdog * but it's not smart enough to not render water tiles, right?
23:24:49    jburgess * don't forget that the 1kB number is including the current water tiles we have. Without them the average might be higher
23:24:51  elasticdog * I've found that each zoom level is roughly 3x the last, which I assume is due to compression of the *.meta files?
23:25:36       apmon * elasticdog: metatiles are not compressed. They are just 64 .png files concatenated
23:25:59    jburgess * it is true though that as you zoom in you don't get much more data, it is just more spread out
23:26:05       apmon * The difference is, I suspect, that the higher the zoom, the simpler the tile. So it will compress better in png form
23:26:11    jburgess * so the higher zoom tiles probably do compress better
23:27:05  elasticdog * well, I'm hoping the real number will be in the 500 GB range
23:27:14     bobkare * also as zoom level increases there will be more blank tiles
23:27:46    jburgess * elasticdog: 500GB is possible, but you can not pre-render everything
23:27:51  elasticdog * it would be nice if it were smart enough to recognize all water tiles and render only one and have the rest point to it
23:28:14  elasticdog * jburgess: why not?
23:28:38  elasticdog * ohh...nevermind, you're saying that I can't prerender it all in under 500GB
23:28:41    jburgess * see the estimate above... rendering every tile gives you too much data
23:29:10  elasticdog * so, when doing a full-render, what level is realistic to cut it off at?
23:29:21    jburgess * but if you just render what people look at & cap the maximum cache size then you can handle a web site like with 500GB
23:29:23     bobkare * osmarender stops at z17
23:29:39    jburgess * provided you can render on demand when someone requests a tile that is not cached
23:29:50     bobkare * but it takes loads of distributed renderers many weeks to render everything
23:29:52    jburgess * you maintain the illusion that you have everything
23:30:34  elasticdog * I've got a fairly standard mod_tile/renderd setup, but that won't exactly do it on demand, just queue it up
23:30:36    jburgess * bobkare: do you know how much space the osmarender tiles take in total?
23:31:03     bobkare * no, don't remember. i think i've seen some numbers floating around, probably on the ML
23:32:07     bobkare * but z17 is the cutoff point where it stops being sane to prerender
23:33:10  elasticdog * jburgess: by max cache, do you mean the max cache that are rendered on the fly, or total tiles cached?
23:33:30    jburgess * I was talking about the maximum stored on disk
23:33:43  elasticdog * is there a place to limit that amount?
23:34:19    jburgess * nothing automatic, but you could probably write some scripts to periodically purge old tiles
23:34:32  elasticdog * gotcha
23:34:42       zereb * emarsden: very roughly, the nodes table is about 130Gb, current_nodes is 56Gb
23:35:02  elasticdog * so I'll probably kill this thing after z16...thanks for the advice everyone
23:35:42     emacsen * Hey OpenStreetMappers... Anyone here know if there's a way to make josm output an osmchange file
23:35:49     emacsen * or any change delta?
23:36:09    jburgess * elasticdog: there is a table on the wiki showing how many tiles have been viewed for each zoom level... I'll see if I can find it
23:36:27    jburgess * elasticdog:
23:37:08    jburgess * so up to around z12 pretty much all the tiles get looked at, past that the ratio that actually get viewed drops very rapidly
23:37:32       zereb * emarsden: all in all looks like the current_* tables are about 330Gb and the history ones about 690Gb. but very roughly speaking...
23:38:07       apmon * zereb: So gpx tables aren't all that big?
23:38:10    emarsden * zereb: thanks
23:38:27  elasticdog * jburgess: THANKS! I've been looking for a page like that...great data to know
23:38:44         ris * so we've run out of space and I should stop mapping?
23:38:58       zereb * apmon: gps points is apparently about 100Gb