I have a thing for empathy. The following is an awesome talk by Jeremy Rifkin at the Royal Society of Arts in the UK, with animated illustrations:
Author Archives: Arjen Lentz
Beyond Bayesian Filtering: Ranking
Spam filters can be roughly categorised into rule-based filters, Bayesian filters, or a combination thereof. As far as I’m concerned, good Bayesian filters are so effective that rule-based ones are not interesting, in particular since wrong or obsolete rules can trigger false positives (a good message being classified as spam, which IMHO is worse than seeing the occasional spam message).
Basically, a Bayesian spam filter analyses a message using word (or word chain) occurrence and calculates the probability that it’s spam. Then, using thresholds, it decides whether the message is to be regarded as spam or not. If it is, it ends up in your spam folder. There is generally a feedback mechanism to enable learning, so you can toss things into the spam folder or grab stuff out, and that will adjust the corresponding probabilities. So in a nutshell, a Bayesian spam filtering system is probabilistic, dynamic, and self-learning. See also the Wikipedia article providing background info on Bayesian spam filtering.
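To make that concrete, here is a minimal sketch of the idea in Python. It is a toy naive Bayes model with add-one smoothing, not how bogofilter, dspam or any other real filter is implemented; the names `TinyBayes`, `tokenize`, `train` and `spam_probability` are made up for this illustration. Note that it returns the probability itself rather than a yes/no verdict.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayes:
    """Toy two-class naive Bayes model (hypothetical, for illustration only)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Feedback step: record a message as 'spam' or 'ham'."""
        self.words[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Return P(spam | text) as a float between 0 and 1."""
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_docs = sum(self.docs.values())
        tokens = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior, so an untrained model stays neutral.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total_words = sum(self.words[label].values())
            for word in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.words[label][word]
                score += math.log((count + 1) / (total_words + vocab + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    model = TinyBayes()
    model.train("cheap pills buy now", "spam")
    model.train("meeting agenda for monday", "ham")
    print(model.spam_probability("buy cheap pills"))
```

The threshold decision (spam folder or not) would sit on top of `spam_probability`, and moving a message in or out of the spam folder would simply call `train` again with the corrected label.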
As mentioned, it does its job extremely well and most email apps use the system in some form, either implemented internally or by using an external tool through an API. Lovely.
What annoys me greatly is that the libraries and APIs appear to focus purely on (email) spam detection. In a way that’s lovely as a lot of work has been done for decoding MIME and HTML, and that helps. On the downside, when you feed in a message all you get out is a yes/no (spam or not), although some have a “maybe/unsure” return value also. It does the specific job, but I actually want to know the original probability! Some do this in some form but again presume that they’re dealing with emails, so they stick it in a header line. I just want to put in a chunk of text, and get out a probability. That’s all. I want less functionality, part of what’s getting done, not more... funny, huh?
I am now thinking I might need to hack/extend/fork an existing library like bogofilter or dspam, but if you know of a library/API that delivers what I want, please do let me know! Given the functionality is essentially in there and the extensive work already done, it makes no sense to write a completely new implementation.
Filtering makes sense for email spam: you periodically check your spam folder to aid the learning process, and that’s it. But ranking would enable us to deal with a different problem: information overload.
First, let’s define this problem: we connect to mailing lists, news groups, forums, RSS feeds, Facebook and Twitter feeds, and so on. We want info on a specific topic, but in the end we get too much info. Actually, each of those is already somewhat selective, as you pick a specific source. With mailing lists, forums and news groups you are interested in the basic topic; with an RSS feed you’re interested in whoever writes the blog or delivers the web site content; and Facebook and Twitter friends may have similar interests to you, so chances are that what they notice, you might want to know about also.
It’s not entirely clear which bits of info in those feeds are of interest to you and which are not, but we already know that the total is too much to keep up with, so reading only part is the only way to go. With that in mind, I reckon it’d be great to rank information so that I see the more important stuff first, and then depending on time I can read more (towards the less interesting). Again, I can’t possibly read everything, so any help in selecting is good, right?
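As a rough sketch of the difference between filtering and ranking: instead of comparing a score against a threshold, you sort by it and read from the top for as long as you have time. The function names and the keyword-based scorer below are placeholders invented for this example; in the real thing, `score` would be whatever returns the Bayesian probability for a chunk of text.

```python
def rank_items(items, score, limit=None):
    """Sort items from most to least interesting, optionally keeping the top few."""
    ranked = sorted(items, key=score, reverse=True)
    return ranked if limit is None else ranked[:limit]


def keyword_score(text, interests=("mysql", "cheese")):
    """Stand-in scorer (an assumption): fraction of words matching my interests."""
    words = [w.strip(".,!?:;") for w in text.lower().split()]
    if not words:
        return 0.0
    return sum(1 for w in words if w in interests) / len(words)


if __name__ == "__main__":
    feed = [
        "Lunch menu for Friday",
        "MySQL replication tips",
        "Cheese and MySQL: cheese",
    ]
    # Only enough time for two items today.
    for title in rank_items(feed, keyword_score, limit=2):
        print(title)
```

Nothing gets thrown away here: the less interesting items are still in the list, just further down.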
My current thoughts on what to do with this (and I’ve already been walking around with the idea for way too long):
- Find library/API that does exactly what I want, or extend/adapt one – submit back the changes, of course. Open Source benefits.
- Adapt feed readers to call the API and somehow apply or display the ranking, depending on their user interface. This could show as sort order, colour coding, etc, depending on context.
- Adapt feed readers to use a feedback mechanism, enabling the backend system to learn my preferences.
- Using known connections (Twitter/Facebook friends), adjust an otherwise possibly neutral probability to enhance/accelerate the learning process; implement these external interactions in such a way that individual privacy is not compromised.
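The last point could look something like the sketch below: a neutral probability gets nudged upwards when friends have shared the item. This is just one possible way to do it, working in log-odds space; the function names, the `boost` and `cap` parameters and their default values are all assumptions of mine. Only an anonymous count is passed in, not who the friends are, which is one simple way of keeping individual identities out of the backend.

```python
import math


def logit(p):
    """Convert a probability to log-odds, clamped away from 0 and 1."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))


def adjust_for_friends(probability, friend_count, boost=0.5, cap=3):
    """Nudge an interest probability up based on how many friends shared the item.

    boost and cap are arbitrary illustration values, not tuned numbers.
    """
    shift = boost * min(friend_count, cap)
    return 1.0 / (1.0 + math.exp(-(logit(probability) + shift)))


if __name__ == "__main__":
    # A new item the model knows nothing about, shared by two friends.
    print(adjust_for_friends(0.5, 2))
```

Once the feedback mechanism has taught the backend enough about my own preferences, the learned probability would dominate and the friend-based shift would matter less.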
Quite possibly there are viable Upstarta-style business opportunities. We’ll see. Key factors are always incentive and opportunity, with the latter also dependent on capability. I have an incentive, as the idea comes from me thinking about how to resolve my own info overload “problem”.
Comments/suggestions welcome!
Smoothie with whey
One of the things you get left with when making cheese is whey: milk separates into curds (the solids) and whey (the watery stuff). It’s full of protein, and depending on the cheese you get quite a lot (from 2L milk for haloumi you get over a litre of whey), so throwing it out would be quite wasteful.
You can just drink it, warm or cold, as a protein drink. Since I’m not really a fan of drinking plain milk (and this is almost the same), I was looking at other options. It can be used as an ingredient in lots of recipes, of course. Now, I still had some mango/banana/strawberry smoothie left over, so I mixed it with the whey (often people mix yoghurt into smoothies but I really don’t like yoghurt), and it’s an excellent mix! Super-tasty, and very healthy.
Making haloumi
Today I made haloumi, from scratch! But let’s start from the beginning. I’m from Holland, and I’m a cheese head ;-) Ok fast-forward a bit. I grew up in the city (Amsterdam), although we used to get fresh milk straight from a farm nearby on the Amstel river. Anyhow, I never made cheese before, so with my interest in growing/making stuff from scratch: something to try!
I use the kits from Mad Millie which is a New Zealand based company, and they’ve also been very helpful and responsive when I had some questions. Their stuff can be found at homebrew stores, and interestingly the owner of the homebrew store near me is very knowledgeable so he must’ve been making different cheeses too!
I’ve successfully made a few cream cheeses, nicely herbed – no photos from that as cream cheese is not particularly photogenic anywhere in the process. The end result is just yum!
I actually had two previous attempts at haloumi but the milk wouldn’t separate even though temperature and other factors were right. Humidity might have been an issue though. The friendly people at Mad Millie suggested adding some calcium chloride, so this time I did. Success!
I used two litres of milk and got almost 250g of haloumi + about a litre of “protein drink” (the whey, it’s quite tasty!)
The milk is warmed to 45°C using the microwave, then a bit of calcium chloride and diluted rennet get stirred in before putting it in a temperature-stable incubator (fancy word for my old big esky) for about half an hour. Then the curd is cut into approx 1cm blocks and left for a little longer. Then the photoreportage begins. Mouse-over for some more detail on the pictured steps.
Next cheese type: feta.
Chess Miss Phoebe
Yep, five years old and now playing chess, since yesterday (31 Oct 2010, for the record). Why not. I’d put out a chess set on my coffee table, part decoration & part to see what she’d think about it (or for others on the pot luck night to play)… and within several hours Phoebe asked me to teach her. Ok!
We first went over the names of the pieces and their basic moves… of course you can make up a cool story about a castle, so that’s all pretty easy to remember. The moves are not too difficult except for the knight (horse) which can be a bit tricky to get right.
We’ve already played two games since, and she’s doing pretty well. With her story mind, she’s decided she doesn’t like the bishops much and so she’s giving them away early on in the game. I said that’s probably not a good idea as she might need those pieces even if she doesn’t like them, but heck, she can play lots and figure it all out in her own way. Exploring is fun, and using your brain at the same time is fabulous!
(apparently my good friend Georg Richter -chess grand master or somesuch- started at age 4, so there ya go)