2012-09-17

My Hadoop-related Speaking Schedule

I'm back from the US, where I had lots of fun getting the HA HDP-1 stuff out the door -I now know about Linux Resource Agents, and too much about Bash -though that knowledge turns out to be terrifyingly useful.

Here's a pic of me sitting outside a cabin in Yosemite Valley where we spent a couple of nights -Camp 4 wasn't on the permitted accommodation list this time.
Curry Camp Cabin, Yosemite

Some people may be thinking "cabin?" "Yosemite?" and "Isn't that where all those people caught Hantavirus and died?". The answer is yes -though they were in wooden-walled tent-things about 100 metres away, and the epidemiology assessments show that even for them the risk is very small. The press like headlines like "20,000 people may be at risk" -missing the point that the larger the set of people "present" for the same number of "ill", the smaller P(ill | present). Which is good as P(die | ill)=0.4.
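For anyone who wants the arithmetic spelled out, here is a minimal Python sketch of that argument. The 0.4 for P(die | ill) and the 20,000 "at risk" figure come from the text above; the case count `N_ILL` and the smaller population of 2,000 are purely hypothetical placeholders chosen for illustration, not reported numbers.

```python
# Sketch of the conditional-probability argument above.
# P(ill | present) = n_ill / n_present, so for a fixed number of cases,
# a bigger "present" population means a smaller individual risk.

P_DIE_GIVEN_ILL = 0.4  # figure quoted in the post


def p_ill_given_present(n_ill, n_present):
    """Fraction of the people present who became ill."""
    if n_present <= 0:
        raise ValueError("n_present must be positive")
    return n_ill / n_present


def p_die_given_present(n_ill, n_present):
    """P(die | present) = P(die | ill) * P(ill | present)."""
    return P_DIE_GIVEN_ILL * p_ill_given_present(n_ill, n_present)


if __name__ == "__main__":
    N_ILL = 10  # HYPOTHETICAL case count, for illustration only
    # 2,000 is a hypothetical comparison; 20,000 is the headline number.
    for n_present in (2_000, 20_000):
        print(
            f"present={n_present:>6}  "
            f"P(ill|present)={p_ill_given_present(N_ILL, n_present):.5f}  "
            f"P(die|present)={p_die_given_present(N_ILL, n_present):.5f}"
        )
```

Whatever case count you plug in, multiplying the "present" population by ten divides both probabilities by ten, which is the bit the headlines miss.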

Even so, I've had some good discussions with the family doctor and the UK Health Protection Agency, who did write a letter saying "if you show symptoms of flu within 6 weeks of visiting, get to a hospital for a blood test". As the doctor said, "we don't get many cases of Hantavirus in Bristol", so it's not something they are geared up for. You know that when they start looking at the same web pages you've already read.

Well, we've got 1-2 weeks left to go. And it was excellent in Yosemite, though next time I'd stay more in Tuolumne Meadows than in the valley itself (too busy), and maybe sort out the paperwork to go back-country.

Yosemite

Assuming that I remain alive for the next fortnight, here is where I'm going to be speaking over the next few months.

Strata EU: Data Availability and Integrity in Apache Hadoop.

I've already done a preview of the talk at a little workshop in Bristol -the live demo of RHEL HA failover did work, so I hope to repeat it. I'll be manning the Hortonworks Booth and wearing branded T-shirts, so will be findable -though I plan to attend some of the talks. In particular, one of the people behind Spatial Analysis UK will be talking -and I just love their maps.

Big Data Con London, Hadoop as a Data Refinery.

Here I'll be exploring the "Data Refinery" metaphor as a way to visualise and communicate the role of the Hadoop stack in existing organisations.

ApacheCon EU, Introduction to Hadoop-dev.

I'm going to talk about the Hadoop development process, QA and testing, contributions. This isn't going to be a basic "here's SVN", or a "Hortonworks and Cloudera can handle everything" talk, but one that looks at the current process -both strengths and weaknesses. As a committer who was not only on their own for some years, but still in a different TZ, I know the problems that arise. I believe it is essential for people using Hadoop in the field to get their feedback in, through JIRA, tests & patches. If there is one thing that I think needs work, it is having a semi-formalised process for external projects to do mentored work relating to Hadoop. That's companies, individuals, interns and university research. All too often we don't know that someone is working on a feature until they turn up with something big that cuts across the projects -and at that point it's too late to shape, to open up to external input, or to even comprehend. Just as Apache has an incubator, I think we need something structured -as the alternative is that this work falls on the floor and ends up wasted.

