Tuesday, September 08, 2009

More from DjangoCon

I joined the Lincoln Loop guys for lunch. This is a well-known Django shop based in Colorado, although the developers are spread out, including one guy from Portugal (we discussed Portland's planned aspect: strong zoning, with a tightly focused CBD).

Every subculture has its "buzz word city" which I work to tune in to (anthropologist hat). Here it's Fabric and Pinax, also pip and virtualenv (both Ian Bicking projects) -- gotta know about 'em. Also Satchmo and Lightning Fast Shop.

These latter two on-line shopping frameworks are the kinds of free tools Stu Quimby was experimenting with, when coding Design Science Toys to do more business over the web.

My side of the conversation drifted into medical uses of SQL and ~SQL (not-SQL). I reiterated my view that legal medical records (LMRs) would go document-oriented non-SQL, because you want to stay schemaless in such an amorphous environment.

No one wants a table structure expected to handle every medical eventuality and/or course of treatment.

However, for outcomes research purposes, when comparing apples with apples, oranges with oranges, you'll want to harvest clinical research records (CRRs) from raw data, scrub it clean by masking true identities, to build these valuable clinical datasets of a more traditional schema-based nature, suitable for sharing among doctors and statisticians. Heads nodded around the table.
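Here's a minimal sketch of that harvesting step, under loud assumptions: the field names, the sample documents and the `pseudonym` helper are all hypothetical, and a salted hash is only pseudonymization, not a complete de-identification scheme. It just shows schemaless documents being flattened into fixed-schema rows with the real identities left behind.

```python
import hashlib

# Hypothetical fixed schema for the research dataset (an assumption).
FIELDS = ("diagnosis", "treatment", "outcome")


def pseudonym(patient_id, secret):
    """Replace a real identifier with a short, stable, salted hash."""
    digest = hashlib.sha256((secret + ":" + patient_id).encode("utf-8"))
    return digest.hexdigest()[:12]


def to_research_row(document, secret):
    """Project one schemaless record onto the fixed research schema."""
    row = {"subject": pseudonym(document["patient_id"], secret)}
    for field in FIELDS:
        # Missing fields become None; extra fields (name, notes) are dropped.
        row[field] = document.get(field)
    return row


# Made-up documents standing in for raw, schemaless medical records.
records = [
    {"patient_id": "A-100", "name": "Jane Doe", "diagnosis": "fracture",
     "treatment": "cast", "outcome": "healed", "notes": "fell while hiking"},
    {"patient_id": "B-200", "name": "John Roe", "diagnosis": "flu"},
]

rows = [to_research_row(doc, "demo-secret") for doc in records]
for row in rows:
    print(row)
```

Every row comes out with the same four columns, which is what makes apples-to-apples comparison possible downstream.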

James Tauber's talk on Pinax replays the "gateway drug" meme. He's also a math geek I see (@jtauber).

How Pinax core-dev works: there's a task tracker, wiki, pastebin, 'new' vs. 'accepted' workflow, 'fix needs review' tag, 'resolved' vs. 'closed' status. This infrastructure dovetails with discussions on Diversity@python.org re the Postgres core-dev process, versus Python's (Selena contributing).

GitHub is the repository for Pinax. #pinax-dev is the IRC channel, #pinax the Twitter channel.

Andy McKay (@clearwind) is next up. "What the heck went wrong?" is his topic. Andy has been a strong player in the Plone community over the years, good seeing him again (we talked later in the lobby, mostly about conference organization issues). He's talking about debugging Django apps (how to).

First: use the dev server, set DEBUG = True. Use assert False and print statements as primitive debugging stubs. Remember to print the type too, since everything prints as a string. mod_wsgi will halt if you print to stdout. Fix: WSGIRestrictStdout Off.
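A quick sketch of why printing the type matters (the `describe` helper is my own hypothetical name, not something from Andy's slides): the integer 42 and the string "42" look the same under a bare print.

```python
def describe(value):
    """Return a debug string showing the type as well as the value."""
    return "%s: %r" % (type(value).__name__, value)


# A bare print hides the difference between these two values.
print(42, "42")

# Printing the type (and the repr) makes the difference obvious.
print(describe(42))
print(describe("42"))
```

In a view, a stray `assert False` does a similar job at a coarser grain: with DEBUG = True the dev server shows the traceback page, local variables included.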

Next level: logging. The logging module in the Standard Library is notoriously complex to configure, though basic use is simple, e.g. log.debug("blah blah"). Then comes pdb (import pdb; pdb.set_trace()). pdb is essential. The werkzeug debugger is also really cool.
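A minimal sketch of the logging step, assuming a made-up logger name ("djangocon.demo") and a toy `average` function of my own. Output goes to an in-memory buffer so the example is self-contained; in a real project you'd point the handler at a file or the console. The pdb line is left commented out so the script runs unattended.

```python
import io
import logging


def make_logger(stream):
    """Build a DEBUG-level logger that writes to the given stream."""
    logger = logging.getLogger("djangocon.demo")  # hypothetical name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()   # avoid duplicate handlers on repeat calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def average(numbers, log):
    """Toy function instrumented with log calls."""
    log.debug("average() called with %r", numbers)
    if not numbers:
        log.warning("empty input, returning 0.0")
        return 0.0
    # Uncomment to drop into the interactive debugger at this point:
    # import pdb; pdb.set_trace()
    return sum(numbers) / len(numbers)


buffer = io.StringIO()
log = make_logger(buffer)
result = average([1, 2, 3], log)
empty_result = average([], log)
print(buffer.getvalue())
```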

Next: unit tests and Django debug toolbar. Continuous integration products are good for keeping costs down: TeamCity, Tinderbox...

Arecibo is a good example of a live error monitoring service.

RESTful Ponies by Mike Malone (@mjmalone), currently with Six Apart: Roy T. Fielding defined REST in his doctoral dissertation (2000). It's an architectural style, like Bauhaus, Baroque, or Functionalist. WWW, based on HTTP, conforms to REST. Representational State Transfer is also like a programming paradigm, like OO. RESTocity: uniform interface, statelessness.

A resource is something that can be named, like an object in Python. On the web, the names are typically URIs. Terminology: a representation is a sequence of bytes on the wire, describing a resource. Hypermedia contain hyperlinks. Application state affects how a request is processed. This isn't the same as resource state.

RESTful web services should use HTTP 1.1 as defined in RFC 2616. This is your API. Give resources URIs. GET, POST, PUT and DELETE are your four essential methods. PATCH is in the wings (has momentum, not here yet). He's using curl for demos. Status codes define how to handle requests and responses.
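To make the uniform interface concrete, here's a toy in-memory sketch of my own (not Mike's demo, which used curl against a real server). The `ResourceStore` class and the /talks URIs are hypothetical; the status codes follow common HTTP usage.

```python
class ResourceStore:
    """Toy in-memory 'server': maps URIs to representations (strings)."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def handle(self, method, uri, body=None):
        """Return a (status_code, payload) tuple for one request."""
        if method == "GET":
            if uri in self.resources:
                return 200, self.resources[uri]
            return 404, None
        if method == "PUT":
            # PUT creates or replaces the resource at exactly this URI.
            status = 200 if uri in self.resources else 201
            self.resources[uri] = body
            return status, body
        if method == "POST":
            # POST to a collection: the server picks the new URI.
            new_uri = "%s/%d" % (uri.rstrip("/"), self.next_id)
            self.next_id += 1
            self.resources[new_uri] = body
            return 201, new_uri
        if method == "DELETE":
            if uri in self.resources:
                del self.resources[uri]
                return 204, None
            return 404, None
        return 405, None  # Method Not Allowed


store = ResourceStore()
created = store.handle("POST", "/talks", "RESTful Ponies")
fetched = store.handle("GET", created[1])
replaced = store.handle("PUT", created[1], "RESTful Ponies (updated)")
deleted = store.handle("DELETE", created[1])
missing = store.handle("GET", created[1])
print(created, fetched, replaced, deleted, missing)
```

The point of the exercise: the same four verbs work on any URI, and the status code alone tells the client what happened.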

Media types are carried in the Content-Type header in HTTP. Examples: JSON, RDF (e.g. FOAF), XML (e.g. Atom), etc.

You also need to specify cacheability using the Cache-Control HTTP header. Django has conditional GET middleware, but you still need to manage ETags and Last-Modified yourself (e.g. use Python hashlib, md5). These same techniques are used to establish preconditions. Statelessness means don't use sessions or cookies. All state information the server might need goes via the hypermedia. django-piston by Jesper Noehr is worth studying for RESTful Django.
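A rough sketch of the ETag idea using hashlib's md5, as mentioned. The function names and the max-age value are my own assumptions, and this is plain Python rather than Django's middleware: it only shows the comparison logic behind a conditional GET (If-None-Match) and a precondition check (If-Match).

```python
import hashlib


def make_etag(body):
    """Return a quoted ETag derived from the response bytes."""
    return '"%s"' % hashlib.md5(body).hexdigest()


def conditional_get(body, if_none_match=None):
    """Return (status, headers, payload), honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # Client's cached copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body


def check_precondition(current_body, if_match):
    """Return 412 if the client's If-Match ETag is stale, else 200."""
    return 200 if if_match == make_etag(current_body) else 412


page = b"<h1>RESTful Ponies</h1>"
first = conditional_get(page)
second = conditional_get(page, if_none_match=first[1]["ETag"])
edited = conditional_get(b"<h1>Edited</h1>", if_none_match=first[1]["ETag"])
print(first[0], second[0], edited[0])
```

The same ETag doubles as a precondition: a PUT carrying a stale If-Match value gets refused instead of silently overwriting someone else's change.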

Simon Willison: Renting old castles and forts is a fun way to host a sprint. WildLifeNearYou.com was co-developed over 10 days using "cowboy development" techniques. Consensus processes are important. Pair programming helps a new team achieve common coding standards. Shared blogs, Twitter-like programs also help coordinate projects. The fort team hacked Bugle, had no Internet connection, just an intranet.

Simon, who works for The Guardian, is giving us his British-style spin on that topic (rapid application development). The scandal around MP expenses provided raw material. The result: a Django application for investigating these revelations. The plan was to break scanned A4 pages into individual images using convert and send them to a MySQL database. In the end, other tools were required given the non-standard nature of the PDF materials.

This process was educational but also proved "cowboy development" has some real downsides. The better way: use pair programming, unit tests, mocks, TeamCity, deployment scripts. He's mentioning redis (link below), with its atomic data structures.

Our day is concluding with the technical panel. I need to bus home to grab the PSF snake and head down to the Portland Python User Group meeting. @psf_snake has tweeted invites to the conferees, with a link to the Meetup site.