Recurring Words in MM

Posted: Thu Jan 22, 2009 3:23 pm
by gma
:idea: What are the top ten most oft recurring words in Mallets Mallet? :?:

(Yes I know, but it's this or my tax return, no brainer really!)

Re: Recurring Words in MM

Posted: Thu Jan 22, 2009 4:39 pm
by J.R.
You'd best start counting then, gma !!

:drinkers:

Re: Recurring Words in MM

Posted: Thu Jan 22, 2009 5:55 pm
by MKM
The commonest single word is "cake" which has appeared 25 times. Not sure what this tells us about the collective CH mind.

"out" 20 times
"bar" 19 times
"time" and "fish" 18 each
"ball", "cricket", "horse" and "over" 17 each
"bird", "board", "music", "water" and "wine" 16 each.

Re: Recurring Words in MM

Posted: Thu Jan 22, 2009 7:22 pm
by Ajarn Philip
There are 1,088 pages of MM nonsense. Mary, please tell me you have some secret new software that could give you that information in the blink of an eye!

Re: Recurring Words in MM

Posted: Thu Jan 22, 2009 7:36 pm
by Katharine
Phil, it's one of three things - you're right, Mary has too much time on her hands (doing her tax return?) or she's telling porkies. I can't accept the last as she is ex-Sixes so it must be one of the first two. :D :D

Re: Recurring Words in MM

Posted: Thu Jan 22, 2009 9:45 pm
by MKM
I do have time on my hands, as I took early retirement (on medical grounds) last year, but I didn't go through all the pages and count the words! The software I used is neither secret nor new, just standard Unix commands. I have Ubuntu installed on my laptop, and used perl to write a script to call wget repeatedly, to download all the Mallets Mallet pages (sorry, Julian, I hope it didn't overload the server). Then just grep to pull out the right lines from each page (they all start with the same string of html), tr to convert it all to lower case, emacs to tidy up a bit, and remove the html, then sort, wc and sort again to find the most common words.
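
For anyone curious, the counting stage described above can be sketched in a few lines of Python. This is not the original perl and shell script, just a rough equivalent of the "lower case, remove the html, sort and count" steps. The function name top_words and the sample lines are made up for illustration, and the download and grep steps are left out entirely.

```python
import re
from collections import Counter


def top_words(lines, n=10):
    """Return the n most common lowercase words across the given lines."""
    counts = Counter()
    for line in lines:
        # Replace anything that looks like an HTML tag with a space.
        text = re.sub(r"<[^>]+>", " ", line)
        # Lower-case the text, then pull out runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(n)


# Hypothetical sample input, standing in for the lines pulled from each page.
sample = [
    "<b>cake</b> tin",
    "tin <i>Cake</i>",
    "cake walk",
]

print(top_words(sample, 2))  # [('cake', 3), ('tin', 2)]
```

Counter.most_common does the job of the final "sort again" step: it returns the words ordered by how often they occur, most frequent first.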

DR would be very scathing about me showing off like this.

It never occurred to me to make it up. :lol:

Re: Recurring Words in MM

Posted: Fri Jan 23, 2009 6:18 am
by Ajarn Philip
MKM wrote: I have Ubuntu installed on my laptop, and used perl to write a script to call wget repeatedly, to download all the Mallets Mallet pages (sorry, Julian, I hope it didn't overload the server). Then just grep to pull out the right lines from each page (they all start with the same string of html), tr to convert it all to lower case, emacs to tidy up a bit, and remove the html, then sort, wc and sort again to find the most common words.
Well, if I'd known it was as easy as that I'd have done it myself... :shock: :shock: :lol:

Re: Recurring Words in MM

Posted: Fri Jan 23, 2009 8:10 am
by englishangel
MKM wrote: The commonest single word is "cake" which has appeared 25 times. Not sure what this tells us about the collective CH mind.

"out" 20 times
"bar" 19 times
"time" and "fish" 18 each
"ball", "cricket", "horse" and "over" 17 each
"bird", "board", "music", "water" and "wine" 16 each.
My son said this sounds like a perfect weekend.

Re: Recurring Words in MM

Posted: Sat Jan 24, 2009 3:28 pm
by Tim_MaA_MidB
Bar, bird, music and wine being the most significant?

Re: Recurring Words in MM

Posted: Sat Jan 24, 2009 5:14 pm
by englishangel
He is 23