Today I woke up at 5:30 and wisely chose not to go back to sleep because then the whole day would go to the dogs. Anyway, it’s 10 now and I have already achieved some things. I parsed the source code of an internal partner webpage (because this would be faster than waiting for them to give us the data in the format we want and they have some WEIRD data formats in there) which means I finally got to dip my feet in regular expressions, something I’ve been putting off in favor of doing lame substring() and split() calls all over the place [eww]. Anyway, that was fun. I like the stringbehind operator as explained here by Gwen.
After parsing the data I loaded it into the database using this tidy script.
drop table sensors; CREATE TABLE sensors ( id varchar(50), lon double precision, lat double precision); \copy sensors(id,lon,lat) FROM 'sensors.csv' DELIMITERS ',' CSV HEADER; alter table sensors add column the_geom geometry; update sensors set the_geom=ST_SetSRID(ST_MakePoint(lon,lat),4326); CREATE INDEX idx_sensors_geom ON sensors USING GIST(the_geom); -- create index, just in case
Now I can analyze those things in QuantumGIS – for which I ordered a book yesterday – this one here by Anita Graser – and hope to get some pointers on doing batch processing and stuff like that.
I am at the moment struggling with making the solution of a variant of “HOW DO I RETURN THE ROW HAVING THE MAX OF ITS GROUP, LOL?” finish in reasonable time because I have “big data”, a couple of joins, and it really sucks. It is weird that this thing is so hard to do in SQL, it’s not exactly an obscure thing that nobody needs. It would be cool to be able to get some query time estimates for postgres so that I know a query is doomed after waiting a bit.
I also need to write a simple map reduce job which should probably take precedence. I also want to learn what stored procedures are exactly (yesterday I wrote my first sql functions and it’s really good for my sanity – goodbye copy pasting 5 lines of SQL from txt files I forget where I saved them the next day).