Effect of dictionary cull on ODP/CSW commonality/differences

Fiona T
Kiloposter
Posts: 1851
Joined: Mon Mar 18, 2019 12:54 pm

Effect of dictionary cull on ODP/CSW commonality/differences

Post by Fiona T »

I did some analysis on the commonality/differences between ODP and CSW before and after the ODP cull to see if it helps or hinders...

I've only considered words between 2 and 9 letters long

If I've got this right...

Pre-cull

ODP - 130262 words

32417 are not in csw21
97845 are in csw21
484 added by csw24 (and 2 removed)

Post-cull

ODP 90707 words

5767 are not in csw21
84940 are in csw21
410 added by csw24 (and 2 removed)


CSW21 - 162192 words

Not in ODP post cull 77252
In ODP post cull 84940
Not in ODP pre-cull 64347
In ODP pre-cull 97845

CSW24 - 1137 new words, 4 removed

new words not in ODP pre-cull 653
new words in ODP pre-cull 484
new words not in ODP post cull 727
new words in ODP post-cull 410

2 words removed both pre and post cull (TRANSMAN, TRANSMEN)
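
For anyone wanting to reproduce this kind of count, here is a minimal sketch of the comparison using Python sets. The function name and the toy word lists are hypothetical (they are not real ODP or CSW contents), and it assumes each list is just one word per entry.

```python
def compare_lexicons(odp, csw, min_len=2, max_len=9):
    """Count the overlap and differences between two word lists.

    Only words of min_len..max_len letters are considered, matching the
    2-9 letter restriction used in the analysis above.
    """
    def in_range(words):
        cleaned = (w.strip().upper() for w in words)
        return {w for w in cleaned if min_len <= len(w) <= max_len}

    odp_set = in_range(odp)
    csw_set = in_range(csw)
    return {
        "odp_total": len(odp_set),
        "csw_total": len(csw_set),
        "in_both": len(odp_set & csw_set),
        "odp_only": len(odp_set - csw_set),
        "csw_only": len(csw_set - odp_set),
    }


# Toy lists, purely illustrative (NOT real dictionary contents).
toy_odp = ["cat", "dog", "emu", "a", "internationalisation"]
toy_csw = ["CAT", "DOG", "YAK"]
result = compare_lexicons(toy_odp, toy_csw)
print(result)
```

The one-letter and twenty-letter toy entries drop out before the comparison, which is the same filtering applied to the real figures.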


In conclusion :)

75% of the pre-cull lexicon is valid in CSW24
94% of the post-cull lexicon is valid in CSW24

60% of CSW24 is valid in pre-cull lexicon
52% of CSW24 is valid in post cull lexicon

So post-cull you're safer risking your countdown word in scrabble, but less safe risking your scrabble word in countdown. HTH :)
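
The percentages can be reproduced from the counts above. This is a minimal sketch: it assumes the number of ODP words valid in CSW24 is the CSW21 overlap, plus the CSW24 additions that are in ODP, minus the two removed words. That combination is an assumption about how the figures fit together, not a documented method.

```python
# Counts copied from the post above (2-9 letter words only).
ODP_PRE = 130262
ODP_POST = 90707
CSW21 = 162192
CSW24_ADDED = 1137
CSW24_REMOVED = 4

# Assumed combination: CSW21 overlap + CSW24 additions in ODP
# - the two ODP words that CSW24 removed (TRANSMAN, TRANSMEN).
pre_in_csw24 = 97845 + 484 - 2
post_in_csw24 = 84940 + 410 - 2
csw24_total = CSW21 + CSW24_ADDED - CSW24_REMOVED


def pct(part, whole):
    """Percentage of whole made up by part, rounded to a whole number."""
    return round(100 * part / whole)


print(pct(pre_in_csw24, ODP_PRE))       # pre-cull ODP valid in CSW24
print(pct(post_in_csw24, ODP_POST))     # post-cull ODP valid in CSW24
print(pct(pre_in_csw24, csw24_total))   # CSW24 valid in pre-cull ODP
print(pct(post_in_csw24, csw24_total))  # CSW24 valid in post-cull ODP
```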

e&oe
Gavin Chipper
Post-apocalypse
Posts: 14319
Joined: Mon Jan 21, 2008 10:37 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Gavin Chipper »

Good info.
Jon O'Neill
Ginger Ninja
Posts: 4588
Joined: Tue Jan 22, 2008 12:45 am
Location: London, UK

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Jon O'Neill »

Cool.
Fiona T
Kiloposter
Posts: 1851
Joined: Mon Mar 18, 2019 12:54 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Fiona T »

I was kinda interested in the 78 words that CSW added this year that ODP zapped at the same time

A lot of them look like proper modern words, with the scrotum featuring disproportionally!
Wonder if we'll see some of them return over the next months...

(Full list if anyone wants to play re-add bingo)

ABYED
ACIDAEMIA
AGGY
AMBIGRAM
AMBIGRAMS
ANGSTING
BASA
BASAS
BAWBAG
BAWBAGS
BIOSECURE
COULDA
CRYPSIS
EMPING
FONIO
FONIOS
GATEKEPT
GLAMP
GLAMPED
GLAMPS
GOETTA
GOETTAS
HOMEGOING
MAGSTRIPE
MASCULISM
MASULAH
MASULAHS
MEGAPOLIS
MEMBRILLO
METAPHONY
MIDDER
MONOMYTH
MONOMYTHS
MULTIHIT
NATTO
NATTOS
NUTBALL
NUTBALLS
NUTSACK
NUTSACKS
OMNICIDE
OMNICIDES
ONGLET
ONGLETS
PANDEIRO
PANDEIROS
PANTSING
PEATED
PIZZAIOLO
PNICOGEN
PNICOGENS
PNICTOGEN
POMPOMMED
PSALTERER
QUINCH
ROUTABLE
SALINATE
SALINATED
SALINATES
SARCODINE
SIMIT
SIMITS
SKEEZY
SKUNKBUSH
STACHE
STACHES
STUFFIE
STUFFIES
SUPERTASK
UNMALTED
WAGWAN
WOULDA
ZAATARS
ZEDONK
ZEDONKS
ZEEDONK
ZEEDONKS
ZEPPOLIS
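
A list like this one is the intersection of two differences: words new in CSW, and words ODP dropped. A minimal sketch of that set logic follows; the function name and the placeholder words are hypothetical and stand in for the real lists.

```python
def added_and_zapped(csw_old, csw_new, odp_pre, odp_post):
    """Return words that CSW added and that ODP removed in its cull."""
    csw_added = set(csw_new) - set(csw_old)
    odp_removed = set(odp_pre) - set(odp_post)
    return sorted(csw_added & odp_removed)


# Placeholder tokens, NOT real dictionary contents.
csw_old = {"ALPHA"}
csw_new = {"ALPHA", "BRAVO", "CHARLIE"}
odp_pre = {"ALPHA", "BRAVO", "CHARLIE", "DELTA"}
odp_post = {"ALPHA", "CHARLIE"}

zapped = added_and_zapped(csw_old, csw_new, odp_pre, odp_post)
print(zapped)
```

In the toy data only BRAVO is both new to the second CSW list and missing from the post-cull ODP list, so it is the only word returned.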
Jon O'Neill
Ginger Ninja
Posts: 4588
Joined: Tue Jan 22, 2008 12:45 am
Location: London, UK

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Jon O'Neill »

Fantastic.
Fiona T
Kiloposter
Posts: 1851
Joined: Mon Mar 18, 2019 12:54 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Fiona T »

Jon O'Neill wrote: Wed Oct 09, 2024 7:09 am Cool.
Jon O'Neill wrote: Wed Oct 09, 2024 10:18 am Fantastic.
Image
Gavin Chipper
Post-apocalypse
Posts: 14319
Joined: Mon Jan 21, 2008 10:37 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Gavin Chipper »

Fiona T wrote: Tue Oct 08, 2024 4:02 pm

Pre-cull

ODP - 130262 words

...

Post-cull

ODP 90707 words

By the way, a lot was made of the new additions (back in 2015 or whenever) like it would completely change max scores etc., and I got the impression that the wordlist was increasing by 10 times or something. But assuming that the post-cull list is something like what it was originally, it's not that big a difference at all: less than a 44% increase.
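
As a quick check of that figure, here is the arithmetic on the two quoted 2-9 letter totals. It treats the post-cull list as the baseline, which is the same assumption made in the paragraph above.

```python
pre_cull = 130262   # ODP words, 2-9 letters, before the cull
post_cull = 90707   # ODP words, 2-9 letters, after the cull

# Relative size of the expanded list against the post-cull baseline.
increase = (pre_cull - post_cull) / post_cull
print(f"{increase:.1%}")  # 43.6%
```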
Gavin Chipper
Post-apocalypse
Posts: 14319
Joined: Mon Jan 21, 2008 10:37 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Gavin Chipper »

OK, so according to this Apterous ticket from 2015, the number of headwords went up from about 140,000 to about 600,000, so approximately a quadrupling. The 10 times thing in my last post was an exaggeration, but this 4 times figure seems like what I remembered from the time. So why the big difference?

Could it relate to words longer than 9 letters which the Apterous ticket might have included but not this thread? I don't see any reason why the ratio difference would be so much.

Edit - I had already been aware of the discrepancy, but only since recently, when Graeme posted this analysis. I didn't get round to posting about it at the time, and this thread reminded me. I feel like I've been under completely the wrong impression about the Countdown dictionary for nearly 10 years. It's still a massive load of words, but nothing like what I thought.
Thomas Cappleman
Series 72 Champion
Posts: 370
Joined: Fri Nov 07, 2008 9:42 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Thomas Cappleman »

Ray's comment there is "potentially bringing the word count from the ODO's ~140000 entries to the ODE's whopping ~600000 entries" - if they'd taken everything from the full OED. But it was just some (semi-)random selection of it, and then partially reverted in some instances.

If it had been like you'd thought, you'd have about 80% of maxes being words you'd never heard of before the update.
Gavin Chipper
Post-apocalypse
Posts: 14319
Joined: Mon Jan 21, 2008 10:37 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Gavin Chipper »

True, but I don't think there was ever any rowing back of the implications by anyone. I don't recall anyone ever saying "Oh, it's not that big after all" or putting the actual numbers (until they were removed again). I feel misled by the whole thing.
Fiona T
Kiloposter
Posts: 1851
Joined: Mon Mar 18, 2019 12:54 pm

Re: Effect of dictionary cull on ODP/CSW commonality/differences

Post by Fiona T »

There were 96,178 removals as per https://www.apterous.org/ticket_view.php?ticket=6969, more than half the lexicon

So the majority were > 9 letters and not included in my analysis. But yeah CSW is far crazier.
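
A rough check of that split, as a sketch only. It assumes the drop in the 2-9 letter count (130262 to 90707) is made up entirely of removals, and that the ticket's total covers the same lexicon; neither is confirmed here.

```python
total_removals = 96178          # figure quoted from the Apterous ticket
removed_2_to_9 = 130262 - 90707  # assumed: net drop equals removals in range

# Whatever is left must fall outside the 2-9 letter range.
removed_outside_range = total_removals - removed_2_to_9
share_outside = removed_outside_range / total_removals

print(removed_outside_range)    # 56623
print(f"{share_outside:.1%}")   # 58.9%
```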