Content management tools fail, Plone marches on

A Jupiter study — I'm usually highly sceptical of these, but hey, maybe they're right for once — says businesses are often dissatisfied with their content management solutions. Complexity, lock-in, overengineering, ridiculous prices abound.

Meanwhile, in Plone land: the Plone Improvement Process (a formalized approach to improvements; this sort of thing seems to be pretty popular in the larger free software projects these days, have you noticed?) has a few interesting improvement proposals for an already fine system: Oslo Skin system and Navigation and validation rewrite.

These, coupled with improvements to and integration of CMFTypes and possibly the brand spanking new TTWType, promise a bright future. Plone is not perfect and probably not suitable for all situations (there's a clear community site bias, but it doesn't have to be used that way), but it is easy, powerful and has a lot of momentum. And clearly there's still lots of space in the CMS field to conquer. Zope 3 is coming, but this Zope 2 thing is nowhere near its end of life.

Asynchronous fetching seems to be working

Coolness. Straw is actually successfully fetching the dozens of feeds I'm reading, asynchronously, without the troublesome network thread. DNS lookups with ADNS (ha, yes, another dependency), reading stuff off the network via asyncore. And the UI is pretty responsive. It's still slower than it used to be, but I think that's a tradeoff I can make. And, as I said earlier, things can possibly be tuned to work out better.

However, one somewhat serious usability problem remains: the stuff is basically fetched off a queue, where things go like this: request a URL -> resolve the host asynchronously -> add the split URL + IP to the fetching queue -> fetch data asynchronously -> stuff the results into the internal data structures. Now, when you press 'g' or when the timer triggers a polling run, we request all the RSS URLs, then when we have finally gotten something, we parse it, see if there are any images, and request those. See the problem? Unless name lookups are in some cases seriously slow, all the articles are fetched before any of the images. I suppose that's fine if you are interested only in the text, but if not, and you are sitting in front of your aggregator, waiting to read the words of the bloggerdom hot off their keyboards, you get to read the items without images.

So maybe I should prioritize the images higher. That is, stuff the image requests to the front of the queue. And I suppose that as long as we are talking data structures instead of, say, things you see in Helsinki in front of bars Friday and Saturday evenings, it won't be a queue any longer. Oh well. That's theory for you.
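
Here's a rough sketch of what I mean, using a double-ended queue so image requests can cut to the front. The class and method names are made up for illustration; this is not Straw's actual code, just the idea under the assumption that a request is a simple (kind, URL) pair.

```python
from collections import deque


class FetchQueue:
    """Hypothetical fetch queue: feeds go to the back, images to the front."""

    def __init__(self):
        self._items = deque()

    def add_feed(self, url):
        # Feed requests wait their turn at the back.
        self._items.append(("feed", url))

    def add_image(self, url):
        # Image requests jump the line so they arrive soon after their article.
        self._items.appendleft(("image", url))

    def next_request(self):
        # Return the next (kind, url) pair, or None when nothing is pending.
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)
```

With this, an image discovered while parsing the first feed gets fetched before the second feed does. One wrinkle: several images pushed to the front one after another come out in reverse order, which probably doesn't matter much here.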

Asynchronicity

The asynchronous networking rework of Straw is progressing pretty nicely. I've already got the basics working: feeds are polled successfully and images are fetched. The only significant bit of code to be converted is the subscription tool / rssfinder.

It hasn't been totally easy. Converting an existing code base which depends on synchronous network operation to be asynchronous is a bit hairy in places. In fact, rssfinder is one beast which makes me consider leaving some synchronous networking in place, but we'll see about that.

Things aren't all peachy in the brand new asynchronous world, though. First of all, fetching data over the network is, at the moment, noticeably slower than in 0.15. However, I suspect that tuning a few constants (timing and limits on open sockets) a bit might help here a lot. And since Straw speaks HTTP 1.1, I could try using the same connections for multiple files, even though that would complicate the code a bit.

The second problem is that DNS lookups are still freezing the program. First thing to do, I guess, is to implement a name cache, so it'll be a one-time (per session) cost. But I suspect I'll have to do something about the lookup part itself. The options are, I guess, either using ADNS or forking a separate lookup process. I've been dependency-happy in the past, but I'll have to think about what to do with this one; I have no idea how common ADNS is.
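
The name cache part could look something like this minimal sketch. The NameCache class is hypothetical, not what Straw ships, and I'm assuming a plain blocking socket.gethostbyname as the default resolver. The first lookup of each host still blocks; the cache only makes sure it blocks once per session.

```python
import socket


class NameCache:
    """Hypothetical per-session cache of host name lookups."""

    def __init__(self, resolver=socket.gethostbyname):
        # The resolver is injectable so the cache can be tried without a network.
        self._resolver = resolver
        self._addresses = {}

    def lookup(self, host):
        # Only the first lookup of a host pays the (blocking) resolution cost.
        if host not in self._addresses:
            self._addresses[host] = self._resolver(host)
        return self._addresses[host]
```

Failed lookups raise whatever the resolver raises and are not cached, so a host that was briefly unreachable gets retried on the next polling run.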

Oh, and a big THANK YOU to Fredrik Lundh for the series about building EffNews on his web site. I have no prior asynchronous networking programming experience, and his examples have been extremely helpful.

© Juri Pakaste 2024