

(Yes I know, but it's this or my tax return, no brainer really!)
Well, if I'd known it was as easy as that I'd have done it myself...
MKM wrote: I have Ubuntu installed on my laptop, and used Perl to write a script that calls wget repeatedly to download all the Mallets Mallet pages (sorry, Julian, I hope it didn't overload the server). Then it was just grep to pull out the right lines from each page (they all start with the same string of HTML), tr to convert everything to lower case, Emacs to tidy up a bit and remove the HTML, then sort, wc and sort again to find the most common words.
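For anyone who fancies trying the counting step without the shell tools, here is a minimal Python sketch. It is not MKM's script: the function name `count_words`, the crude regex used to strip tags, and the sample lines are all assumptions made up for illustration, and it skips the downloading part entirely.

```python
import re
from collections import Counter


def count_words(html_lines):
    """Count lower-cased words in lines of HTML, ignoring the tags."""
    counts = Counter()
    for line in html_lines:
        # Crude tag removal; fine for simple pages, not a real HTML parser.
        text = re.sub(r"<[^>]+>", " ", line)
        # Lower-case first (the "tr" step), then pull out runs of letters.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


# Hypothetical sample lines standing in for the lines grep would pull out.
sample = [
    "<b>Cake</b> tin",
    "<i>fish</i> cake",
    "Cake stand",
]

# Print the words from most to least common (the final "sort" step).
for word, n in count_words(sample).most_common():
    print(word, n)
```

`Counter.most_common()` does the job of the final sort, so the whole thing stays in one short loop.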
My son said this sounds like a perfect weekend.
MKM wrote: The commonest single word is "cake", which has appeared 25 times. Not sure what this tells us about the collective CH mind.
"out" 20 times
"bar" 19 times
"time" and "fish" 18 each
"ball", "cricket", "horse" and "over" 17 each
"bird", "board", "music", "water" and "wine" 16 each.