OpenAerialMap/Meeting Feb 26, 2015

From OpenStreetMap Wiki

Feb 26 19:02:27 Cristiano: Good morning - who's here for the OAM weekly?
Feb 26 19:03:04 FrankW: is listening in for the meeting.
Feb 26 19:03:29 lossyrob: Hi, Rob Emanuele from Azavea here.
Feb 26 19:03:40 Cristiano: Hi Rob!
Feb 26 19:03:42 chippy_: i am here for a little bit
Feb 26 19:04:10 Cristiano: kiwiirc.com seems to have some issues, in case people were trying to enter from the link on the wiki
Feb 26 19:04:30 Cristiano: Hi Chippy - good to have you here!
Feb 26 19:04:39 chippy_: (just wanted to offer services / advice for online georectification. We rectified online 3G geoeye imagery on mapwarper during Haiti crisis)
Feb 26 19:04:54 Cristiano: Let me just send a note to the list about using the other IRC link - 1 sec
Feb 26 19:05:11 sanderd17: Cristiano, not for me ^ (just logged in with kiwi irc)
Feb 26 19:05:56 Cristiano: oh good, well for some reason it did not let me
Feb 26 19:06:23 nhv: waves
Feb 26 19:07:49 Cristiano: Here's some topics for discussion I put together https://docs.google.com/document/d/1NomAhJdQZW7vU7zfFhB9rjBjCEaJWcrJEbJQj1f-pzI/edit?usp=sharing
Feb 26 19:08:04 Cristiano: feel free to add or comment there directly
Feb 26 19:09:35 Cristiano: What I would like to do today is pick up from the recent threads on the mailing list and see if we can define a workflow for OAM
Feb 26 19:10:31 Cristiano: and see what pieces of software from all those other cool projects we can already use for some of the processing and tiling components
Feb 26 19:11:03 Cristiano: priority for OAM is really not to re-invent the wheel, re-use and keep it simple
Feb 26 19:11:52 Cristiano: Hi mojodna and jj0hns0n!
Feb 26 19:12:00 Cristiano: and wildintellect
Feb 26 19:12:22 BlakeGirardot: Hi all.
Feb 26 19:12:34 mojodna: Cristiano: hi! digesting the gdoc. the topics are a good breakdown of orthogonal concerns.
Feb 26 19:13:28 Cristiano: we should also think of how we can make the central/main OAM nodes able to scale, and at the same time being able to package everything into a laptop install for those working in the field/disconnected
Feb 26 19:14:01 Cristiano: maybe it's asking too much :) ... but let's see what are your ideas
Feb 26 19:15:04 lossyrob: can you go over the use case a little? For someone using this against a scalable server solution vs someone in the field using a laptop
Feb 26 19:15:41 Cristiano: oh, and by the way, I would like to focus on the processing/tiling parts since we already discussed the catalog and metadata aspects
Feb 26 19:15:43 mojodna: data-wise, i’m partial to S3 (or an S3-compatible service) being the source of truth. laptops / individuals could pull subsets for use within the rendering / processing pipeline.
Feb 26 19:15:52 Cristiano: sure lossyrob
Feb 26 19:17:12 mojodna: to support people in the field (with, say, drones), we should handle the notion of staging data for merging w/ the source of truth
Feb 26 19:17:27 mojodna: (which would also apply to the process of ingesting data from any other source)
Feb 26 19:18:04 wildintellect: +1 to S3 compatible
Feb 26 19:18:21 jj0hns0n: yeah, +1 on S3 for me too
Feb 26 19:18:39 lossyrob: +1 S3
Feb 26 19:18:48 FrankW: I'd like that as a primary data home.
Feb 26 19:19:10 Cristiano: yes, those are the types of examples, but def consider the scenario for someone in the field with fresh UAV collect, trying to LAN share the data with local mapper teams
Feb 26 19:19:16 jj0hns0n: as far as I understand, openstack storage api is nearly identical to s3
Feb 26 19:19:24 wildintellect: should be
Feb 26 19:19:26 lossyrob: 'merging with source of truth'...the raw data will be separate in ingest, correct? then the merging is what...mosaicing?
Feb 26 19:19:27 mojodna: jj0hns0n: ditto google cloud storage
Feb 26 19:19:34 wildintellect: hence I said compatible
Feb 26 19:19:35 Cristiano: so hopefully we can give them an easy tool for doing that, drag-n-drop style
Feb 26 19:20:21 lossyrob: Cristiano, a local collection would be perhaps an initial tool, and then the imagery could be pushed upstream to S3 when they are connected
Feb 26 19:21:03 Cristiano: exactly - and at the same time update the central index with the availability of that dataset
Feb 26 19:22:02 mojodna: lossyrob: i think it remains separate after being ingested too, just sliced up and lightly reprocessed (if necessary). post-processing (one-time or dynamic) would handle the mosaicking.. (?)
Feb 26 19:22:10 lossyrob: seems like the local collection tool, the raw image storage & metadata storage (central index), are separate pieces, with other processing against raw imagery another concern
Feb 26 19:22:20 mojodna: +1
Feb 26 19:22:36 jj0hns0n: yeah, this is the proposed architecture as I understand it lossyrob
Feb 26 19:23:52 lossyrob: so the post-processing steps (which I'm a bit murky on) should be available to a local machine, as well as processed on the server side and made available to the public?
Feb 26 19:24:43 wildintellect: Anything uploaded to either a local or central catalog
Feb 26 19:24:57 Cristiano: yes, processing can be available both locally - client sends ready tiles - or on the OAM server side - client sends GeoTIFFs
Feb 26 19:24:59 jj0hns0n: I think we are conflating the necessary set of processing steps by introducing the idea of 'raw' imagery from uav
Feb 26 19:25:03 wildintellect: needs to be piped through a processing step to put it all in the standard proj and tiling scheme
Feb 26 19:25:40 Cristiano: raw in this case is the ready geotiff - georeferenced and hopefully orthorectified
Feb 26 19:25:50 jj0hns0n: the most basic 'local' use case is to take some ortho imagery that someone else hands you, do some basic image processing to stretch or whatever so you can look at it and make a local cache that you can serve to others on your network then upload to the 'cloud' and update the index
Feb 26 19:26:09 Cristiano: we should not have to deal with raw imagery like in jpegs from UAVs + GPS frame center
Feb 26 19:26:15 wildintellect: correct
Feb 26 19:26:22 jj0hns0n: in my mind, that can/should just all take place in a qgis plugin
Feb 26 19:26:26 wildintellect: we recommend OpenDroneMap, etc for those things
Feb 26 19:26:29 lossyrob: 'raw' to me represents what Chris Holmes mentioned in his post: https://s3-us-west-2.amazonaws.com/landsat-pds/L8/001/008/LC80010082015051LGN00/index.html
Feb 26 19:26:49 jj0hns0n: yeah, to me that kind of processing is way out of scope for OAM
Feb 26 19:27:14 jj0hns0n: its not OAMs role to produce nice mosaics from raw landsat right?
Feb 26 19:27:18 Cristiano: lossyrob: that's the type of raw we mean :)
Feb 26 19:27:28 wildintellect: Cristiano, not quite
Feb 26 19:27:32 jj0hns0n: yeah, I dont think so
Feb 26 19:27:38 wildintellect: Landsat is more than 3 bands
Feb 26 19:27:40 jj0hns0n: leave that to people that know about that stuff :)
Feb 26 19:27:45 nhv: during Katrina I got imagery directly off the camera and georectified it
Feb 26 19:27:52 wildintellect: someone needs to simplify it to RGB before we take it
Feb 26 19:27:54 Cristiano: well, assuming it's one ortho scene, why not?
Feb 26 19:28:06 FrankW: note that landsat is already orthorectified, not super raw.
Feb 26 19:28:11 wildintellect: right
Feb 26 19:28:14 FrankW: But it does imply rescaling, etc.
Feb 26 19:28:16 Cristiano: oh sorry, did not look through, I thought it was the ready geo RGB
Feb 26 19:28:33 lossyrob: sorry yeah, I'm thinking geotiff RGBA
Feb 26 19:28:41 wildintellect: it is geo but its more bands than that
Feb 26 19:29:09 wildintellect: and each band on that page is a separate file
Feb 26 19:29:14 FrankW: I think a preliminary restriction to input data being RGB or RGBA 8bit ortho data is reasonable.
Feb 26 19:29:17 jj0hns0n: my larger point is that I think it should be out of scope for OAM to be concerned with producing landsat mosaics from a set of individual scenes
Feb 26 19:29:35 wonderchook: yeah, I could see OAM pointing to that as a resource
Feb 26 19:29:41 wildintellect: sure, if someone else processes landsat and makes it available to use - then we would catalog it
Feb 26 19:29:42 FrankW: is here in part to represent the landsat case, but I can see it as a complicated case.
Feb 26 19:29:44 wonderchook: but not creating the actual mosaic
Feb 26 19:29:52 jj0hns0n: right, there are plenty of toolchains for doing this kind of thing right
Feb 26 19:29:59 FrankW: jj0hns0n: does that mean no real mosaicing smartness at all?
Feb 26 19:30:12 jj0hns0n: no, there needs to be lots of mosaicing smarts
Feb 26 19:30:20 nhv: again for Katrina I got most sat data in 16bit form
Feb 26 19:30:22 lossyrob: so, OAM is not trying to mosaic incoming drone imagery for map use?
Feb 26 19:30:34 wildintellect: no
Feb 26 19:30:46 Cristiano: not mosaicking other than overlay sources
Feb 26 19:30:49 wildintellect: you send it through Pix4D, or OpenDroneMap first
Feb 26 19:30:50 nhv: as this was the quickest route to near realtime imagery
Feb 26 19:30:53 jj0hns0n: lossyrob, yes I think so, but not 'raw' imagery, something that someone has already produced an ortho mosaic from right?
Feb 26 19:30:54 wildintellect: then upload it to OAM
Feb 26 19:31:43 Cristiano: right, but in some cases it could be just a single L8 or geoeye scene - think of disaster response situations
Feb 26 19:32:00 wildintellect: UAV -> local stitching-> Georef/Ortho -> Upload to OAM
Feb 26 19:32:21 wildintellect: L8 -> RGB extraction -> Upload to OAM
Feb 26 19:32:44 wildintellect: we only start at the Upload to OAM part
Feb 26 19:32:48 mojodna: some process -> RGB GeoTIFF -> Upload to OAM
Feb 26 19:33:02 jj0hns0n: right, I concur with wildintellect and mojodna ... that is the scope of OAM
Feb 26 19:33:12 lossyrob: ok. so the post-processing part that some of us have been talking about, is a separate concern
Feb 26 19:33:16 lossyrob: gotcha
Feb 26 19:33:17 wildintellect: yes
Feb 26 19:33:20 jj0hns0n: not necessarily
Feb 26 19:33:31 nhv: that is a good place to start
Feb 26 19:33:32 wildintellect: OpenDroneMap is in that territory
Feb 26 19:33:41 wildintellect: as are many other tools
Feb 26 19:33:46 jj0hns0n: but producing a single global mosaic out of all this random imagery is in scope for oam right
Feb 26 19:33:49 nhv: but once you have 8bit rgb it is easy
Feb 26 19:33:51 Cristiano: The post-processing for OAM is more like re-projecting and tiling
Feb 26 19:33:59 wildintellect: jj0hns0n, sure
Feb 26 19:34:14 wildintellect: how to mosaic different sets from different sources is our tricky part
Feb 26 19:34:36 jj0hns0n: yep, and that is where tools like geotrellis et al can be very useful right?
Feb 26 19:34:42 mojodna: wildintellect: track priority and overlay according to that?
Feb 26 19:34:48 mojodna: needs to look into geotrellis again
Feb 26 19:34:52 wildintellect: could be
Feb 26 19:34:53 lossyrob: jj0hns0n, that's the hope!
Feb 26 19:34:55 jj0hns0n: mojodna exactly, that is the hardest part
Feb 26 19:35:15 jj0hns0n: lossyrob could be ossim or other toolchains too, but I do think that geotrellis may make the most sense
Feb 26 19:35:21 wildintellect: fyi, I know a couple of data sources we can start with for testing purposes
Feb 26 19:35:40 mojodna: jj0hns0n: can you elaborate on the difficulty for this n00b?
Feb 26 19:35:46 nhv: imo you keep each layer separate onto itself and let the user decide what to combine overlay etc
Feb 26 19:35:46 wildintellect: PublicLab's mapknitter output (Google takes some already), and USDA NAIP
Feb 26 19:36:15 FrankW: The NAIP is available on S3 (BTW)
Feb 26 19:36:19 mojodna: nhv: +1, with defaults
Feb 26 19:36:36 nhv: NAIP on S3 sweet :-)
Feb 26 19:37:21 nhv: NASA's onearth gdal driver looks pretty good for tiling overviewing etc
Feb 26 19:37:26 mojodna: for this to work, the central store needs everything stored in the same projection (right?). has that been decided?
Feb 26 19:37:27 FrankW: nhv: the NAIP is unfortunately a user pay bucket, which is a bit complicated, but hey.
Feb 26 19:37:37 nhv: gottcha
Feb 26 19:37:39 Cristiano: so, if we start from where OAM picks up the data -NAIP or L8 on S3 for example -, what do you see the logical workflow being there. How do things get orchestrated and prioritized to go out as oam.org/tms?
Feb 26 19:38:34 lossyrob: images would need to be reprojected, tiled, and mosaiced, and then served out in a TMS service
Feb 26 19:38:38 nhv: FrankW have you looked at https://github.com/nasa-gibs/mrf
Feb 26 19:39:15 mojodna: we need to track the URL(s) and the data extent somewhere (presumably that was the metadata / catalog convo)
Feb 26 19:39:25 wildintellect: mojodna, correct
Feb 26 19:39:28 lossyrob: and if it's done in a streaming fashion, where new uploads to S3 would kick off processes that merged the new imagery into an existing dataset, that would have to be managed
Feb 26 19:40:06 jj0hns0n: lossyrob also uncompressed etc in order to optimize for rendering/tiling
Feb 26 19:40:15 Cristiano: right, so I was really curious/fascinated about vapor-clock for managing all that
Feb 26 19:40:17 mojodna: lossyrob: “merged the new imagery into an existing dataset” — let’s unpack this piece.
Feb 26 19:40:25 Cristiano: if I understand it right..
Feb 26 19:41:09 lossyrob: jj0hns0n, well the tiles could be stored compressed, uncompressed per request, is what we've been doing (though we paint the rasters dynamically as well, not imagery data)
Feb 26 19:41:25 Cristiano: yeah, we probably dont want to merge them, just overlay them... and remove/cleanup where necessary
Feb 26 19:42:16 lossyrob: Ok, so each new image kicks out old imagery that covers the same area?
Feb 26 19:42:17 mojodna: lossyrob: what’s your native format?
Feb 26 19:42:30 lossyrob: we store rasters as byte arrays.
Feb 26 19:42:31 mojodna: lossyrob: that seems…lossy
Feb 26 19:42:34 jj0hns0n: lossyrob rendering from things like jp2 or mrsid is really slow. Getting everything into a 'native' format makes it easier to optimize the rendering
Feb 26 19:42:35 lossyrob: but we ingest GeoTIFF
Feb 26 19:42:40 mojodna: (re: kicking out old imagery)
Feb 26 19:43:04 lossyrob: haha....yes it does
Feb 26 19:43:16 lossyrob: but I'm confused about overlay vs merge
Feb 26 19:43:19 jj0hns0n: Crschmidt spent a lot of time figuring out optimal formats for rendering through mapserver once upon a time
Feb 26 19:43:22 nhv: thinks the NASA MRF format might be just what is needed
Feb 26 19:44:17 mojodna: Cristiano: is there background/history on overlay vs merge?
Feb 26 19:44:23 nhv: tiled overviewed tiffs are hard to beat if the tiling scheme is 'correct' eg matches what is going to be requested
Feb 26 19:44:49 Cristiano: it's not kicked out, just overlaid by something new, but then we should still track previous imagery versions through something like VRTs like Seth was suggesting
Feb 26 19:44:52 jj0hns0n: nhv MRF does look exactly like what we need
Feb 26 19:45:00 nhv: :-)
Feb 26 19:45:21 Cristiano: nhv: can you expand for us? :)
Feb 26 19:45:25 jj0hns0n: https://github.com/nasa-gibs/mrf/blob/master/spec/mrf_spec.md
Feb 26 19:45:32 jj0hns0n: ^ Cristiano
Feb 26 19:45:33 wildintellect: if it works for Modis it should scale for ue
Feb 26 19:45:35 wildintellect: us
Feb 26 19:45:36 mojodna: nhv: if the target is TMS, then storing semi-authoritative copies as 3857 GeoTIFFs to overlay from seems sensible
Feb 26 19:45:53 wildintellect: yup that is the target
Feb 26 19:46:00 mojodna: (at resolutions matching the nearest integral zoom)
Feb 26 19:46:02 wildintellect: as lousy as that might be in some place
Feb 26 19:46:33 lossyrob: if you're using something like cassandra or accumulo, then each tile at each zoom level is its own entry. So we index tiles based on a 2D or 3D Z curve
Feb 26 19:46:36 mojodna: it suggests an overview scheme as well.
Feb 26 19:46:49 lossyrob: i.e. don't rely on a file format to index tiles for us
Feb 26 19:46:58 nhv: lossyrob: that makes sense
Feb 26 19:47:13 nhv: eg morton index or some such
Feb 26 19:47:32 lossyrob: right. and we're working on a Hilbert curve implementation too
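
A minimal sketch of the Z-curve (Morton) indexing mentioned above, where each tile at each zoom level is its own entry and the key is built by interleaving the bits of the tile column and row. The function names and the (zoom, code) key layout are assumptions for illustration only; this is not GeoTrellis or tileshare code.

```python
def morton_index(x: int, y: int, bits: int = 32) -> int:
    """Interleave the low `bits` bits of x and y into one Z-curve code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits land on even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits land on odd positions
    return code


def tile_key(zoom: int, x: int, y: int) -> tuple:
    """Hypothetical key for a key-value store: zoom level plus Morton code."""
    return (zoom, morton_index(x, y))


if __name__ == "__main__":
    # The four tiles of zoom 1, visited in Z order.
    for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        print((x, y), tile_key(1, x, y))
```

Tiles that are neighbours on the map tend to receive nearby codes, which is why a sorted key-value store can often answer a bounding-box request with a small number of range scans.
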
Feb 26 19:48:33 jj0hns0n: lossyrob now that its public, Im curious how mrgeo's implementation compares to geotrellis, but that is a discussion for a different time/channel
Feb 26 19:48:55 lossyrob: yup, I could definitely discuss details and a lot of thoughts on that subject
Feb 26 19:49:01 Cristiano: you are losing me here :) ...so the MRF would be used for the processed tiles or the input data?
Feb 26 19:49:01 wildintellect: hmm, so the other proj option is 4326
Feb 26 19:49:24 nhv: there is some morton index c code in this project http://www.geodyssey.com/tileshare/index.html
Feb 26 19:49:35 nhv: to accelerate tile fetching
Feb 26 19:49:38 mojodna: wildintellect: the advantage of 3857 (with matching resolutions) is that pixels are 1:1
Feb 26 19:49:43 lossyrob: I think the suggestion was to use MRF as a storage backing to actually serve out TMS tiles, is that correct?
Feb 26 19:49:51 nhv: yes
Feb 26 19:49:57 wildintellect: I'd forgotten that we wanted to consider the option for people to pull WCS of the original inputs
Feb 26 19:49:58 jj0hns0n: that is my understanding lossyrob and nhv
Feb 26 19:50:14 lossyrob: how does that scale?
Feb 26 19:50:16 jj0hns0n: wildintellect I think we determined that to be out of scope for now too right?
Feb 26 19:50:43 Cristiano: yes, no WCS for now
Feb 26 19:50:47 wildintellect: there are some issues with usefulness of 3857 at some latitudes
Feb 26 19:51:03 Cristiano: just simple PNG TMS
Feb 26 19:51:04 mojodna: if the output is TMS, it has the same issues.
Feb 26 19:51:07 wildintellect: really I suppose we just let the tile zoom level go as high as the data has
Feb 26 19:51:14 lossyrob: nhv, nice I'll take a look. Here is a proposal for a general-use Space Filling Curve library: http://www.locationtech.org/proposals/sfcurve
Feb 26 19:51:27 Cristiano: so, can GDAL write MRF?
Feb 26 19:51:44 mojodna: we don’t preclude future WCS if we hang onto source files in 4326
Feb 26 19:51:53 mojodna: (so 3857s are derivatives)
Feb 26 19:51:54 nhv: it can with that github code
Feb 26 19:51:58 jj0hns0n: mojodna yeah, dont preclude but not a primary concern for now right?
Feb 26 19:52:16 FrankW: would be interested in incorporating the MRF driver upstream.
Feb 26 19:52:29 nhv: FrankW++
Feb 26 19:52:37 jj0hns0n: we want to go from individual ortho images to MRF files and only keep some reference to the original files for metadata purposes
Feb 26 19:54:01 nhv: another interesting GDAL driver not mainstream https://bitbucket.org/chchrsc/kealib
Feb 26 19:54:04 Cristiano: cool - so it sounds like gdal will do without any need for mapserver or other tiling software in the middle right?
Feb 26 19:54:16 nhv: that we could potentially exploit
Feb 26 19:54:17 jj0hns0n: frankw, I can imagine going from rpf -> mrf would make alot of people happy
Feb 26 19:54:43 FrankW: :-)
Feb 26 19:55:21 jj0hns0n: or really nitf -> mrf
Feb 26 19:55:53 lossyrob: so basically taking a set of ortho images -> MRF tiled, right? I'm not sure what GDAL would do with the overlaps in that case.
Feb 26 19:56:40 Cristiano: I know Alex already argued about the issues, but is there any other option for this type of processing that could take advantage of GPU processing? -thinking the laptop case
Feb 26 19:56:51 nhv: note there is other code in the nasa-gibs github site that exploits the MRF driver
Feb 26 19:57:40 mojodna: Cristiano: reproduction / derivative generation (overlays)
Feb 26 19:57:49 mojodna: s/reproduction/reprojection/
Feb 26 19:59:09 wildintellect: http://comments.gmane.org/gmane.comp.gis.gdal.devel/23404
Feb 26 20:00:21 Cristiano: mojodna; do you take advantage of the AWS GPU cluster for GPU processing in your workflow?
Feb 26 20:00:27 wildintellect: http://mojodna.net/2015/01/27/resolved-gdal-on-aws-gpus.html
Feb 26 20:00:30 Cristiano: I saw a reference in your email
Feb 26 20:00:32 FrankW: wildintellect: Seth from that thread is now my coworker here at Planet Labs. We still argue about the value of GPU for general raster processing.:-)
Feb 26 20:00:53 mojodna: s/Seth/Even/ (I’m seth)
Feb 26 20:01:04 mojodna: it did speed things up a bit
Feb 26 20:01:12 mojodna: but it was kind of a whatever thing
Feb 26 20:01:13 FrankW: mojodna: He is Seth Price.
Feb 26 20:01:16 mojodna: oh!
Feb 26 20:01:24 mojodna: ha!
Feb 26 20:01:37 FrankW: In any event, I'd discourage GPU complication for now.
Feb 26 20:01:48 wildintellect: seems we have experts on the topic in house, and I defer to them
Feb 26 20:01:56 wildintellect: I agree get it working
Feb 26 20:02:01 Cristiano: OK, great!
Feb 26 20:02:13 wildintellect: then we can always add optional GPU/Parallelization where it might git
Feb 26 20:02:14 wildintellect: fit
Feb 26 20:03:05 Cristiano: right. And GDAL already does multithread processing for most things right?
Feb 26 20:03:43 Cristiano: FrankW: and I guess tiling?
Feb 26 20:04:11 FrankW: No, GDAL does not generally make much use of multithreading, though it should mostly be threadsafe for reading.
Feb 26 20:04:18 FrankW: So we might want to take advantage of that.
Feb 26 20:04:32 FrankW: depending on what we wrap things in (ie. GeoTrellis)
Feb 26 20:04:57 wildintellect: keep in mind the other way to think about it is just processing more images at the same time in parallel
Feb 26 20:05:05 FrankW: wildintellect: +1
Feb 26 20:05:13 wildintellect: rather than trying to speed up the processing of a single image
Feb 26 20:05:17 nhv: finds multiprocessing works well enough, much easier than multithreading
Feb 26 20:05:17 Cristiano: OK. I'm saying if we ship it bundled with the OAM package to a laptop, could it make use of all the CPU power there?
Feb 26 20:05:28 wildintellect: sure
Feb 26 20:05:44 Cristiano: OK, so it just depends how we manage processes then
Feb 26 20:05:45 wildintellect: you have 8 cores, run 8 images at a time
Feb 26 20:05:56 wildintellect: your disk might choke 1st
Feb 26 20:06:02 wildintellect: if you don't have an SSD
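
A rough sketch of the "8 cores, run 8 images at a time" idea: rather than speeding up one image, start one external gdalwarp process per image and cap how many run at once. The helper names and the particular option choices (bilinear resampling, tiled GeoTIFF output) are assumptions for illustration; the threads below only wait on the external processes, so the actual work runs in parallel across OS processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_warp_command(src: str, dst: str) -> list:
    """Command line for reprojecting one image to Web Mercator (EPSG:3857)."""
    return [
        "gdalwarp",
        "-t_srs", "EPSG:3857",   # target projection used by the TMS output
        "-r", "bilinear",        # resampling method (an arbitrary choice here)
        "-of", "GTiff",
        "-co", "TILED=YES",      # write internally tiled GeoTIFFs
        src, dst,
    ]


def warp_all(jobs, workers: int = 8) -> list:
    """Run one gdalwarp process per (src, dst) pair, `workers` at a time."""
    def run(job):
        src, dst = job
        return subprocess.run(build_warp_command(src, dst), check=True).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, jobs))
```

A call such as warp_all([("a.tif", "a_3857.tif"), ("b.tif", "b_3857.tif")], workers=2) would need gdalwarp on the PATH. As noted in the discussion, disk I/O may become the limit before the CPU does.
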
Feb 26 20:06:24 Cristiano: cool- I'm thinking when we have one single geotiff from a UAV collect
Feb 26 20:06:45 mojodna: Cristiano: slice it up as a first step?
Feb 26 20:06:47 lossyrob: if you're working on one image, then no to multiprocess, yeah? or could you just use GDAL windows for that
Feb 26 20:06:58 lossyrob: right, you could cut it up into blocks first, true
Feb 26 20:07:08 mojodna: windows = same thing, yeah?
Feb 26 20:07:24 Cristiano: of course, I'm just thinking of the inexperienced user
Feb 26 20:07:29 lossyrob: well, reading the window into memory vs writing tiles to disk first
Feb 26 20:08:02 Cristiano: so that he can just drag-n-drop the full geotiff and have it TMS ready without much intervention
Feb 26 20:08:20 nhv: as wildintellect alluded to most of the time these processes are io bound
Feb 26 20:08:25 lossyrob: if it's single, then yeah wouldn't gdal2tiles just take care of that?
Feb 26 20:08:47 lossyrob: for the single image local case
Feb 26 20:09:02 Cristiano: OK. talking about disk then... last thing I wanted to touch on is about serving and strategies for optimization there.
Feb 26 20:09:21 Cristiano: how we handle publishing and load balancing
Feb 26 20:09:57 Cristiano: say on the local instance where all nodes are on the same hardware and then on the main OAM nodes sitting in AWS with scalable resources
Feb 26 20:10:20 mojodna: S3 (with static data) for main OAM, MBTiles exports for laptops?
Feb 26 20:10:55 nhv: suggests looking at the nasa-gibs resources he pointed out earlier, the on-earth stuff has had a lot of trial and error testing and is pretty efficient
Feb 26 20:11:01 Cristiano: no, the actual serving of the TMS
Feb 26 20:11:24 lossyrob: Haven't experimented with this yet, but S3 with cloudfront caching as a TMS tile store, have a simple TMS service that just points the requests to the appropriate S3 file
Feb 26 20:11:31 Cristiano: openaerialmap.org/tms or mylaptop/oam/tms
Feb 26 20:11:54 wildintellect: once the tiles are made any tile handler will work
Feb 26 20:12:00 nhv: MRF you just point to the appropriate block in a file
Feb 26 20:12:16 wildintellect: well actually once the tiles are made the urls are known
Feb 26 20:12:22 Cristiano: OK, so you just point to a TMS structure on S3?
Feb 26 20:12:31 wildintellect: well it's tricky
Feb 26 20:12:39 wildintellect: depending on if you want specific sets
Feb 26 20:12:42 nhv: if you are serving out of a tiled overviewed files it is just one file
Feb 26 20:12:43 wildintellect: or the latest mosaic
Feb 26 20:13:01 wildintellect: or time based index
Feb 26 20:13:03 lossyrob: Cristiano, Sure, then latency becomes the issue.
Feb 26 20:13:20 lossyrob: nhv one file won't scale (in the OAM cloud instance)
Feb 26 20:13:34 nhv: ok 16 files :-)
Feb 26 20:13:53 wildintellect: well sorta, if you have copies of it and load balance it would
Feb 26 20:14:06 lossyrob: right, then you have to maintain copies of the entire dataset...eek!
Feb 26 20:14:12 wildintellect: I agree
Feb 26 20:14:26 nhv: anyway I think you will find that large tiled files are more efficient than a gazillion small files
Feb 26 20:14:32 mojodna: the advantage of a TMS structure on S3 is that there’s no set of servers to maintain
Feb 26 20:14:54 mojodna: and it’s Amazon’s responsibility to deal with the gazillion small files
Feb 26 20:14:57 lossyrob: right. and the potential S3 latency is the issue. So front it with a cache...
Feb 26 20:15:01 mojodna: right
Feb 26 20:15:25 Cristiano: I just can't picture it without some sort of "mapserver" indexing and directing to all the source TMSes....
Feb 26 20:15:27 mojodna: the disadvantage of a TMS structure on S3 is that all tiles need to be generated up-front
Feb 26 20:15:44 wildintellect: Cristiano, thats the catalog instance
Feb 26 20:15:44 mojodna: Cristiano: can you elaborate?
Feb 26 20:15:52 Cristiano: do we need a routine to update main overview every t?
Feb 26 20:15:54 wildintellect: well each dataset is its own TMS
Feb 26 20:15:56 lossyrob: Cristiano, you would have a small service pointing requests to the appropriate S3 url
Feb 26 20:16:06 wildintellect: in addition to being part of the mosaic TMS
Feb 26 20:16:12 mojodna: and each combination of datasets is its own TMS
Feb 26 20:16:21 lossyrob: i would write it in Spray, it would be not many lines, could be scaled up and load balanced
Feb 26 20:16:44 lossyrob: but then again I'm a Scala dude, so choose your own poison :)
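
A minimal sketch of the "small service pointing requests to the appropriate S3 url" idea, written in Python rather than Spray. It only shows the request-to-URL mapping that such a service would redirect to. The bucket name, dataset name and the dataset/zoom/x/y key layout are hypothetical; nothing in the discussion fixes them.

```python
def flip_row(zoom: int, row: int) -> int:
    """Convert a tile row between TMS (origin bottom-left) and XYZ (origin top-left)."""
    return (2 ** zoom - 1) - row


def s3_tile_url(bucket: str, dataset: str, zoom: int, x: int, y: int,
                ext: str = "png") -> str:
    """Build the URL of a pre-rendered tile stored under dataset/zoom/x/y.ext."""
    size = 2 ** zoom
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError(f"tile {zoom}/{x}/{y} is outside the grid for this zoom")
    # Virtual-hosted style S3 URL; the key layout is an assumption.
    return f"https://{bucket}.s3.amazonaws.com/{dataset}/{zoom}/{x}/{y}.{ext}"


if __name__ == "__main__":
    # Hypothetical bucket and dataset names.
    print(s3_tile_url("oam-tiles-example", "uav-survey", 3, 2, 5))
```

A service built this way holds no imagery itself, so it can be replicated behind a load balancer, with a cache in front to hide S3 latency as suggested above.
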
Feb 26 20:16:45 wildintellect: we do need one TMS that is the footprints of available TMS
Feb 26 20:16:58 wildintellect: and that could be created on the fly on the main catalog
Feb 26 20:17:10 wildintellect: to help you find the imagery set you want
Feb 26 20:17:20 mojodna: wildintellect: +1
Feb 26 20:17:22 lossyrob: would that need to be TMS, or some other service? Like a catalog service
Feb 26 20:17:24 Cristiano: OK, so you can have a master TMS pointing to all the sub-TMSes? But that needs to be an image too, not just a footprint right?
Feb 26 20:17:42 wildintellect: lossyrob, well it's actually a CSW based on our prev discussions
Feb 26 20:17:42 lossyrob: that could be queried, and then you would choose the TMS based on that info.
Feb 26 20:17:51 wildintellect: but you want a map of them
Feb 26 20:18:04 wildintellect: people want a map to zoom to a place and look at what's an option
Feb 26 20:18:14 lossyrob: gotcha
Feb 26 20:18:36 wildintellect: Cristiano, yes there is a master mosaic
Feb 26 20:18:46 wildintellect: but it can only hold the most recent image for a place
Feb 26 20:18:53 Cristiano: or I guess we don't, and just use a base L8 global base and then just overlay all other TMSes
Feb 26 20:19:05 wildintellect: maybe
Feb 26 20:19:21 Cristiano: right, OK. so but we don't have to build that every t, right?
Feb 26 20:19:41 wildintellect: well depends on what you're using for the base
Feb 26 20:20:00 wildintellect: yes there are new images from L8 everyday 1/16th of the world
Feb 26 20:20:03 wildintellect: but you don't want those
Feb 26 20:20:18 wildintellect: you want a base composite cloud free from the previous year
Feb 26 20:20:24 mojodna: this is making me think that static tiles don’t make sense
Feb 26 20:20:35 wildintellect: basically what google earth engine has
Feb 26 20:20:44 FrankW: has TMS global cloudless landsat mosaic tiles in s3.
Feb 26 20:21:01 wildintellect: oh good, well that can be the base then
Feb 26 20:21:05 FrankW: :-)
Feb 26 20:21:13 Cristiano: FrankW: awesome
Feb 26 20:21:17 lossyrob: public s3?
Feb 26 20:21:27 FrankW: Well, not public yet, but that is quite doable.
Feb 26 20:21:43 lossyrob: I'll definitely be on the lookout for it!
Feb 26 20:21:57 jj0hns0n: this is correct, when I did this, I had naturalvue as the ultimate base and that was updated like once a year
Feb 26 20:21:58 Cristiano: mojodna: what's your concern about static tiles?
Feb 26 20:22:09 mojodna: too many combinations that update too frequently
Feb 26 20:22:26 lossyrob: mojodna, what does the dynamic story look like?
Feb 26 20:22:34 Cristiano: well... Planet will come up with an idea there :)
Feb 26 20:22:51 lossyrob: dynamic tile serving is tricky, for request response speed. gotta keep it as lean as possible
Feb 26 20:23:05 mojodna: dynamic overlays from desired sources
Feb 26 20:23:16 wildintellect: lets focus on getting the sets in and worry about what the global overview should be later
Feb 26 20:23:30 Cristiano: right, first principle of OAM: keep it simple
Feb 26 20:23:32 Cristiano: :)
Feb 26 20:23:34 mojodna: read only the bits that you’re going to render; this is why i was talking about 3857 source rasters being 1:1 with TMS output
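
A small sketch of what "3857 source rasters being 1:1 with TMS output" implies in numbers: each integral zoom level has a fixed ground resolution in Web Mercator, so a source raster resampled to that resolution maps pixel for pixel onto tiles at that zoom. The 256-pixel tile size and the choice to pick the nearest zoom on a log scale are assumptions; rounding up to avoid losing detail would be an equally reasonable policy.

```python
import math

EARTH_RADIUS_M = 6378137.0                      # WGS84 semi-major axis
WORLD_WIDTH_M = 2 * math.pi * EARTH_RADIUS_M    # width of the EPSG:3857 world
TILE_SIZE = 256                                 # assumed tile size in pixels


def resolution_for_zoom(zoom: int) -> float:
    """Metres per pixel (at the equator) for a given integral zoom level."""
    return WORLD_WIDTH_M / (TILE_SIZE * 2 ** zoom)


def nearest_zoom(resolution_m: float) -> int:
    """Integral zoom whose resolution is closest (on a log scale) to the source."""
    zoom = math.log2(WORLD_WIDTH_M / (TILE_SIZE * resolution_m))
    return max(0, round(zoom))


if __name__ == "__main__":
    for gsd in (30.0, 1.0, 0.05):   # e.g. Landsat-like, NAIP-like, UAV-like pixel sizes
        z = nearest_zoom(gsd)
        print(gsd, z, resolution_for_zoom(z))
```
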
Feb 26 20:23:52 wildintellect: the unique non-global sets is what we have that others don't
Feb 26 20:24:08 mojodna: wildintellect: yes
Feb 26 20:24:28 nhv: FrankW awesome :-)
Feb 26 20:25:31 wildintellect: mixing and matching TMS together is something for end users to play with for now
Feb 26 20:25:39 Cristiano: OAM started with the idea of allowing UAV/drone mappers to put their data somewhere to share.. and those types of data sources are still relatively small. But then of course we want NAIP and public orthos, and L8,...
Feb 26 20:26:04 wildintellect: sure but when you use naip, you typically just want NAIP
Feb 26 20:26:13 wildintellect: from a specific year set
Feb 26 20:26:18 FrankW: Lets nail the relatively local contributed dataset case well.
Feb 26 20:26:25 wildintellect: not a mix of naip inside of L8
Feb 26 20:27:07 lossyrob: Local contributed dataset - just a matter of uploading to an indexed S3 set?
Feb 26 20:27:28 mojodna: lossyrob: this is SRTM 90m rendered on a t2.micro reading data from S3 via FUSE: http://ec2-54-152-68-8.compute-1.amazonaws.com/#9/46.8733/-121.7615
Feb 26 20:27:39 mojodna: https://github.com/stamen/openterrain-map
Feb 26 20:27:52 mojodna: uses https://github.com/mojodna/tessera under the hood
Feb 26 20:28:04 lossyrob: Nice...hillshade computed dynamically?
Feb 26 20:28:20 mojodna: no, it’s treated as a derivative source
Feb 26 20:28:42 mojodna: source: 4326, deriv: 4326 hillshade, deriv: 3857 hillshade
Feb 26 20:28:45 lossyrob: ah. The second demo on geotrellis.io actually computes the hillshade dynamically based on user input for sun angle etc
Feb 26 20:29:09 mojodna: i’m intentionally staying as far away from analysis as possible
Feb 26 20:29:49 Cristiano: OK. well I'd like to learn more about your caching and load balancing ideas, let's continue over on the list. Unless you guys have other things you want to discuss today we can probably wrap it up
Feb 26 20:29:50 wildintellect: Cristiano, we done for today?
Feb 26 20:29:59 Cristiano: Yes :)
Feb 26 20:30:05 lossyrob: mojodna gotcha. Is the S3 data public?
Feb 26 20:30:26 Cristiano: I'd like to thank everyone, this was a great discussion!
Feb 26 20:30:35 mojodna: ish. i need to move it somewhere more permanent and generate lists
Feb 26 20:30:42 mojodna: Cristiano: thank you!
Feb 26 20:30:44 FrankW: is overwhelmed by the options and new technologies!
Feb 26 20:30:51 wildintellect: mojodna, gonna need to do SRTM 30m soon
Feb 26 20:31:01 mojodna: wildintellect: that’s next (week)
Feb 26 20:31:05 lossyrob: yeah, great to talk to devs working on the samish problems, this was great!
Feb 26 20:31:08 wildintellect: I have a copy if you want
Feb 26 20:31:15 mojodna: wildintellect: in S3? ooh!
Feb 26 20:31:20 mojodna: yes, please
Feb 26 20:31:22 Cristiano: Lots of good ideas and I'm really excited to see all these people involved!
Feb 26 20:31:23 wildintellect: mojodna, both Winkey and I made a script to pull it from usgs
Feb 26 20:31:34 wildintellect: actually at UCD or Telascience
Feb 26 20:31:46 mojodna: faster than USGS, one hopes
Feb 26 20:31:51 FrankW: has srtm30 on s3, but tragically also not yet public.
Feb 26 20:31:53 Cristiano: Cheers and have a good rest of the day/evening!
Feb 26 20:32:00 wildintellect: zipped bil, been meaning to batch convert to tiff
Feb 26 20:32:10 wildintellect: mojodna, you can easily pull a copy in a few hours
Feb 26 20:32:12 mojodna: wildintellect: i might be able to do that for you
Feb 26 20:32:29 wildintellect: sure it's not hard, just hasn't been a priority at work
Feb 26 20:32:55 mojodna: it lines up with proving out our approach to batch processing
Feb 26 20:33:04 wildintellect: fyi middle east is not included yet
Feb 26 20:33:29 jj0hns0n: FrankW is there global SRTM30 available now?
Feb 26 20:33:45 wildintellect: jj0hns0n, almost
Feb 26 20:33:46 FrankW: yes, though not quite completely global coverage
Feb 26 20:33:54 FrankW: missing parts of the middle east, southern russia.
Feb 26 20:33:55 wildintellect: middle east not released yet
Feb 26 20:34:02 wildintellect: http://data.biogeo.ucdavis.edu/library/elevation/srtm/
Feb 26 20:34:09 jj0hns0n: cool, that was all only 90 outside the US last time I really touched it
Feb 26 20:34:15 FrankW: http://e4ftl01.cr.usgs.gov/SRTM/SRTMGL1.003/2000.02.11
Feb 26 20:34:23 wildintellect: yup last Sept started rolling out 30m
Feb 26 20:34:58 jj0hns0n: lossyrob will have to chat about mrgeo v geotrellis someday, It was like 2010 when I last worked with it and only public now
Feb 26 20:35:01 FrankW: returns to struggling with s3 permissions and boto.
Feb 26 20:35:21 mojodna: FrankW: good luck!
Feb 26 20:35:26 jj0hns0n: we should get jason surratt to join us here someday
Feb 26 20:36:20 wildintellect: mojodna, Winkey and I are on #telascience on freenode if you want to talk srtm 30m - we hacked the wget script together to batch pull it from usgs
Feb 26 20:36:32 mojodna: wildintellect: cool, will do
Feb 26 20:36:50 wildintellect: I can't recall the telascience url to the files right now
Feb 26 20:36:57 wildintellect: but my server is reasonably fast
Feb 26 20:48:47 lossyrob: jj0hns0n: yeah! send me an email at rdemanuele@......com and we can talk about it