Which words are the most useful to know?

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Which words are the most useful to know?

Post by Robert Foster » Mon Jul 09, 2018 9:10 pm

I did an analysis on which words would be most useful to learn for each of 3, 4, and 5 vowel picks in standard letters rounds.
You can find the results here:
http://bit.ly/2KXffJw

Download the file as an Excel spreadsheet (don't try and open it as a Google Sheet) and feel free to play around - you can use it as a study tool or just for reference. Hope you fellow word geeks find it useful!

Paul Erdunast
Series 74 Champion
Posts: 144
Joined: Fri Mar 20, 2009 10:59 pm

Re: Which words are the most useful to know?

Post by Paul Erdunast » Tue Jul 10, 2018 3:50 am

This is really great!

Tim Down
Acolyte
Posts: 109
Joined: Fri Feb 12, 2016 9:45 am

Re: Which words are the most useful to know?

Post by Tim Down » Wed Jul 11, 2018 11:28 am

Robert Foster wrote:
Mon Jul 09, 2018 9:10 pm
I did an analysis on which words would be most useful to learn for each of 3, 4, and 5 vowel picks in standard letters rounds.
You can find the results here:
http://bit.ly/2KXffJw

Download the file as an Excel spreadsheet (don't try and open it as a Google Sheet) and feel free to play around - you can use it as a study tool or just for reference. Hope you fellow word geeks find it useful!
Rob, this is brilliant. Thank you so much.

Jon O'Neill
Ginger Ninja
Posts: 4361
Joined: Tue Jan 22, 2008 12:45 am
Location: London, UK

Re: Which words are the most useful to know?

Post by Jon O'Neill » Wed Jul 11, 2018 12:33 pm

Yeah, this is great. Nice work.

Only thing I would suggest as an improvement is to do some face-up shuffling in the simulated rounds (or weight against rounds with double or triple letters). It seems to me that the most useful words seem a bit skewed towards double-letters compared to https://www.apterous.org/wordstats.php?variant=0&dic=0

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Thu Jul 19, 2018 6:11 am

Thanks guys!
Jon O'Neill wrote:
Wed Jul 11, 2018 12:33 pm
Only thing I would suggest as an improvement is to do some face-up shuffling in the simulated rounds (or weight against rounds with double or triple letters). It seems to me that the most useful words seem a bit skewed towards double-letters compared to https://www.apterous.org/wordstats.php?variant=0&dic=0
Good point - I always assumed the shuffling was totally random like at a co:event but this looks not to be the case. I uploaded a second version which decreases the chance of two of the same letter being drawn successively: http://bit.ly/2KXffJw

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Wed Mar 25, 2020 8:50 am

Latest version is here! You have to download it because the file won't show up in preview. Make sure to enable editing and macros if prompted.

https://bit.ly/2Qykkto

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Fri Apr 10, 2020 9:45 pm

Great work Rob!

Matt Morrison
Post-apocalypse
Posts: 7659
Joined: Wed Oct 22, 2008 2:27 pm
Location: London

Re: Which words are the most useful to know?

Post by Matt Morrison » Fri Apr 10, 2020 10:46 pm

GARB

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Thu Nov 26, 2020 9:54 pm

Latest version of my word usefulness doc following the Nov 2020 update:

https://drive.google.com/drive/folders/ ... OBoD4LcS_4

Dan Byrom
Rookie
Posts: 35
Joined: Sun Apr 16, 2017 2:42 pm

Re: Which words are the most useful to know?

Post by Dan Byrom » Fri Nov 27, 2020 11:46 am

Beautiful Rob. Thanks!

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Fri Nov 27, 2020 4:36 pm

Robert Foster wrote:
Thu Nov 26, 2020 9:54 pm
Latest version of my word usefulness doc following the Nov 2020 update:

https://drive.google.com/drive/folders/ ... OBoD4LcS_4
Lovely jubbly

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Fri Nov 27, 2020 5:57 pm

Updated version thanks to suggestions from Callum and Fiona. A few changes:

- The logic behind the word that appears in the 'Single Word' column has been changed - for words with multiple anagrams, it should state the most useful of these to learn. (Roughly the logic is "prefer words that take useful letters on the end like S and D; if still a tie, prefer words which are still words after removing the last letter; if still a tie, prefer the word that is most frequent in real life usage".)

- An extra column has been added listing the letters that the Single Word takes on the end to form other words (so 'drs' for BREATHE)

- Default picking ratios have been updated to more closely reflect those of apterous. This has resulted in a different ordering of the most useful words. If you preferred the original listing you can just change your picking preferences back to (1/3, 1/3, 1/3)
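
For anyone curious, the tie-break in the first bullet could be sketched roughly as below. This is only an illustration and not how the spreadsheet actually computes it: the `Candidate` fields, the choice of S and D as the only "useful" end letters, the placeholder anagram and the frequency numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Assumption: only S and D count as "useful" end letters here.
USEFUL_HOOKS = set("sd")


@dataclass
class Candidate:
    word: str
    takes: str          # letters the word takes on the end, e.g. "drs"
    stem_is_word: bool  # True if the word minus its last letter is still valid
    frequency: float    # hypothetical real-life usage score


def pick_single_word(anagrams):
    """Pick one word from a group of anagrams using the three-step tie-break."""
    return max(
        anagrams,
        key=lambda c: (
            len(USEFUL_HOOKS & set(c.takes)),  # 1. useful letters on the end
            c.stem_is_word,                    # 2. still a word without last letter
            c.frequency,                       # 3. more common in real life
        ),
    )


# Made-up group: BREATHE (takes 'drs') against a placeholder anagram.
group = [
    Candidate("ANAGRAM_X", takes="", stem_is_word=False, frequency=9.0),
    Candidate("BREATHE", takes="drs", stem_is_word=True, frequency=5.0),
]
best = pick_single_word(group)
```

Comparing tuples means a later criterion only matters when all earlier ones are tied, which is the "if still a tie" behaviour described above.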

Jon O'Neill
Ginger Ninja
Posts: 4361
Joined: Tue Jan 22, 2008 12:45 am
Location: London, UK

Re: Which words are the most useful to know?

Post by Jon O'Neill » Fri Nov 27, 2020 6:10 pm

I've said it before and I'll say it again. This really is good.

Fiona T
Enthusiast
Posts: 475
Joined: Mon Mar 18, 2019 12:54 pm

Re: Which words are the most useful to know?

Post by Fiona T » Fri Nov 27, 2020 7:54 pm

Robert Foster wrote:
Fri Nov 27, 2020 5:57 pm
Updated version thanks to suggestions from Callum and Fiona. A few changes:

- The logic behind the word that appears in the 'Single Word' column has been changed - for words with multiple anagrams, it should state the most useful of these to learn. (Roughly the logic is "prefer words that take useful letters on the end like S and D; if still a tie, prefer words which are still words after removing the last letter; if still a tie, prefer the word that is most frequent in real life usage".)

- An extra column has been added listing the letters that the Single Word takes on the end to form other words (so 'drs' for BREATHE)

- Default picking ratios have been updated to more closely reflect those of apterous. This has resulted in a different ordering of the most useful words. If you preferred the original listing you can just change your picking preferences back to (1/3, 1/3, 1/3)
Brilliant - thanks Rob!
8-) <-2m-> 8-)

Charlie Reams
Site Admin
Posts: 9473
Joined: Fri Jan 11, 2008 2:33 pm
Location: Cambridge

Re: Which words are the most useful to know?

Post by Charlie Reams » Sun Nov 29, 2020 5:02 pm

This is very cool! Any chance you could share it in some generic form (like CSV) for those of us who don't have Excel?

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Sun Nov 29, 2020 6:45 pm

Cheers! I've uploaded the CSVs for each tab now.

https://drive.google.com/drive/folders/ ... OBoD4LcS_4

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Sun Nov 29, 2020 7:28 pm

Robert Foster wrote:
Sun Nov 29, 2020 6:45 pm
Cheers! I've uploaded the CSVs for each tab now.

https://drive.google.com/drive/folders/ ... OBoD4LcS_4
Lovelier jubblier

Charlie Reams
Site Admin
Posts: 9473
Joined: Fri Jan 11, 2008 2:33 pm
Location: Cambridge

Re: Which words are the most useful to know?

Post by Charlie Reams » Sun Nov 29, 2020 9:16 pm

Thank you so much!

I'm really enjoying the collaborative improvement of these lists. One thing I wonder is how to capture the relative impact of spotting/missing words of different length. In particular, knowing 9s might be much more important than it seems because there's a 36-point swing at stake (versus 16 or less for all shorter words). To investigate this, I pulled some data from recent human-vs-human apterous games (about 2 million rounds total) on the average net impact in terms of max length:


Length  Expected net points for...
        Missing  Spotting
4       -1.57    +0.70
5       -1.76    +1.20
6       -1.88    +1.62
7       -1.96    +2.07
8       -1.75    +2.98
9       -3.66    +7.13
(To give an example of what this table means, if the max in a letters round is 8 and you miss it then you expect your opponent will gain about 1.75 points over you on average, whereas if you spot it then you expect to gain about 2.98 points over them. Obviously these are just averages and could look quite different depending on the skill level of the players and the specific selection involved.)

The main thing this shows very clearly is that knowing nines is much more important than it might seem in terms of frequency. Maybe one way to incorporate this information into the ranking is to say that one extra 9 is worth 10.79 (the spread from +7.13 to -3.66) versus, say, 4.73 for an eight, and weight all the frequency data accordingly.
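
A minimal sketch of that weighting idea, using the figures from the table above. The function names and the `frequency` input are assumptions for illustration; how frequency is measured is left open.

```python
# Expected net points from the table: max length -> (missing, spotting)
NET_POINTS = {
    4: (-1.57, 0.70),
    5: (-1.76, 1.20),
    6: (-1.88, 1.62),
    7: (-1.96, 2.07),
    8: (-1.75, 2.98),
    9: (-3.66, 7.13),
}


def swing(length):
    """Spread between spotting and missing a max of this length."""
    missing, spotting = NET_POINTS[length]
    return round(spotting - missing, 2)


def weighted_usefulness(frequency, length):
    """Weight a word's frequency (hypothetical input) by the swing for its length."""
    return frequency * swing(length)


swings = {length: swing(length) for length in NET_POINTS}
```

Here `swings[9]` comes out at 10.79 and `swings[8]` at 4.73, matching the figures quoted above.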

Anyway this is great so let me know if there's any other data I can pull that would help.

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Mon Nov 30, 2020 10:59 am

Charlie Reams wrote:
Sun Nov 29, 2020 9:16 pm
I pulled some data from recent human-vs-human apterous games
I think a really interesting (but possibly difficult to calculate) piece of data from apterous would be to see a list of words most frequently missed that would have changed the outcome of a game from loss to win if other rounds in the game stayed the same.

e.g. if Bob loses 99 - 100 to Alice, and RD 13 was:

Selection: AEIORDNTS
Bob: STAINED
Alice: STRAINED

Then any 8s or 9s from that selection get another mark next to them on the list.

I guess you'd calculate it as follows:
- Find all games with a margin (M) of < 26*
- In those games, find all of the rounds where a user did not get a letters max
- For each word available in that round that is better than what the loser declared, calculate the swing (S) for that word
- For each word where S > M, add that word to the list
- *26 because the largest possible swing in a letters round is +18 pts for you, -8 for the opponent.
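
The steps above could be sketched roughly like this. The game and round dictionaries are a made-up shape purely for illustration (apterous data will look different), and it assumes both declarations in a round were valid words.

```python
from collections import Counter


def points(length):
    """Letters-round score: one point per letter, 18 for a nine."""
    return 18 if length == 9 else length


def round_scores(len_a, len_b):
    """Scores for two valid declarations; the longer word scores, ties both score."""
    if len_a > len_b:
        return points(len_a), 0
    if len_b > len_a:
        return 0, points(len_b)
    return points(len_a), points(len_b)


def swing(loser_len, winner_len, candidate_len):
    """Change in the loser's margin had they declared the candidate instead."""
    old_loser, old_winner = round_scores(loser_len, winner_len)
    new_loser, new_winner = round_scores(candidate_len, winner_len)
    return (new_loser - new_winner) - (old_loser - old_winner)


def game_changing_words(games, max_margin=26):
    """Tally words that would have turned a loss into a win (S > M)."""
    tally = Counter()
    for game in games:
        margin = game["winner_score"] - game["loser_score"]
        if not 0 < margin < max_margin:
            continue
        for rnd in game["letters_rounds"]:
            for word in rnd["available"]:
                if len(word) <= rnd["loser_len"]:
                    continue  # not better than what the loser declared
                if swing(rnd["loser_len"], rnd["winner_len"], len(word)) > margin:
                    tally[word] += 1
    return tally


# The Bob v Alice example: Bob loses 99 - 100 and missed STRAINED.
games = [{
    "winner_score": 100,
    "loser_score": 99,
    "letters_rounds": [
        {"loser_len": 7, "winner_len": 8, "available": ["STRAINED"]},
    ],
}]
tally = game_changing_words(games)
```

With a 7 against an 8, a missed nine gives `swing(7, 8, 9) == 26`, which is where the 26 cut-off comes from.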

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Mon Nov 30, 2020 11:41 am

Nice sleuthing Charlie, I may take up your offer of juicy apto data! I should have mentioned that in my doc, the words are already weighted according to how many points they score (x18 for a 9, x8 for an 8 etc).

It turns out that my crude weightings are actually pretty close to the ones generated by your method (I've inflated yours so they can be compared more easily with mine):


        points...                 weightings...
length  lost    gained  swing     Charlie's  Rob's
4       -1.57   +0.70   2.27      3.85       4
5       -1.76   +1.20   2.96      5.02       5
6       -1.88   +1.62   3.50      5.94       6
7       -1.96   +2.07   4.03      6.84       7
8       -1.75   +2.98   4.73      8.03       8
9       -3.66   +7.13   10.79     18.31      18
So perhaps 9s in my list are slightly underrated, but not by a massive amount.
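
For anyone wanting to check the comparison: the post doesn't say exactly how the swing figures were inflated, but one guess that happens to reproduce the "Charlie's" column is to rescale the swings so that both weightings add up to the same total (48). A sketch under that assumption:

```python
# Swing per max length (from the table) and the points-based weightings.
SWING = {4: 2.27, 5: 2.96, 6: 3.5, 7: 4.03, 8: 4.73, 9: 10.79}
POINTS = {4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 18}

# Assumed scaling: make the two sets of weightings sum to the same total.
scale = sum(POINTS.values()) / sum(SWING.values())
rescaled = {length: round(s * scale, 2) for length, s in SWING.items()}

# How much more a nine is worth than an eight under each weighting.
nine_vs_eight_swing = SWING[9] / SWING[8]
nine_vs_eight_points = POINTS[9] / POINTS[8]
```

On these numbers a nine is worth about 2.28 eights by swing against 2.25 by points, which is the "slightly underrated" gap.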

I've added a readme that explains a few other things about the doc.
https://drive.google.com/drive/folders/ ... OBoD4LcS_4
Last edited by Robert Foster on Tue Dec 01, 2020 9:50 am, edited 1 time in total.

Charlie Reams
Site Admin
Posts: 9473
Joined: Fri Jan 11, 2008 2:33 pm
Location: Cambridge

Re: Which words are the most useful to know?

Post by Charlie Reams » Mon Nov 30, 2020 10:31 pm

Robert Foster wrote:
Mon Nov 30, 2020 11:41 am


        points...                 weightings...
length  lost    gained  swing     Charlie's  Rob's
4       -1.57   +0.70   2.27      3.85       4
5       -1.76   +1.20   2.96      5.02       5
6       -1.88   +1.62   3.50      5.94       6
7       -1.96   +2.07   4.03      6.84       7
8       -1.75   +2.98   4.73      8.03       8
9       -3.66   +7.13   10.79     18.31      18
So perhaps 9s in my list are slightly underrated, but not by a massive amount.

I've added a readme that explains a few other things about the doc.
https://drive.google.com/drive/folders/ ... OBoD4LcS_4
Super interesting! So I think that's a good argument for mostly studying to hit maxes and not worrying about the "secondary effect" of words that will probably score even when they aren't the max, which means your lists are probably even more useful than I thought.

L'oisleatch McGraw
Enthusiast
Posts: 388
Joined: Sun Dec 13, 2015 2:46 am
Location: Waterford

Re: Which words are the most useful to know?

Post by L'oisleatch McGraw » Mon Jan 18, 2021 4:46 pm

If I sort the document by word length (i.e. Column D), so that it lists all the 9s first, then the 8s etc., the data in columns O and R disappears. Is there a way to sort it that way without that happening?
:arrow: :arrow: :arrow: S:778-ochamp

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Mon Jan 18, 2021 8:03 pm

Sorting by length seems to work OK for me. I did:

Highlight columns B-R (or columns B-Q in the latest version) > Sort & Filter > Custom Sort > Tick the 'My data has headers' box > Column D Largest to Smallest.

Also 'Takes' information only shows for words of 8 letters and below, so if you have all of the 9s at the top of the list then you won't see anything until you scroll down to the 8s.

L'oisleatch McGraw
Enthusiast
Posts: 388
Joined: Sun Dec 13, 2015 2:46 am
Location: Waterford

Re: Which words are the most useful to know?

Post by L'oisleatch McGraw » Tue Jan 19, 2021 5:45 pm

Thank you. I am not familiar with excel. I didn't tick the relevant box... and may have done other things wrong, not sure. But it is working now.
By the way, I don't know if this is any use to you for future editions, but what I did was include Column A when re-sorting the data, then add a new column showing "rank by letter count", and in that I numbered all 31583 nine-letter words by usefulness, then all 28617 eights, and so forth. The benefit of having the rank numbers in column A is that you can compare the relative usefulness of the 1000th best nine with the 1000th best eight, etc.

Handy for someone who likes to learn say, the first 100 most useful nines, then the first 100 eights, then 7s, and then go back and look at the next 100 nines... etc.
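
For those working from the CSVs rather than Excel, the same "rank by letter count" idea and the batch-by-batch study order could be sketched like this. The rows are placeholders with made-up scores, and the tuple layout is an assumption, not the actual column layout of the doc.

```python
from itertools import groupby


def rank_by_length(rows):
    """rows: (word, length, usefulness) tuples.

    Returns (rank_within_length, word, length, usefulness),
    longest words first and most useful first within each length.
    """
    ordered = sorted(rows, key=lambda r: (-r[1], -r[2]))
    ranked = []
    for _, group in groupby(ordered, key=lambda r: r[1]):
        for rank, (word, length, score) in enumerate(group, start=1):
            ranked.append((rank, word, length, score))
    return ranked


def study_order(ranked, batch=100):
    """First `batch` nines, then first `batch` eights, ..., then the next batch."""
    return sorted(ranked, key=lambda r: ((r[0] - 1) // batch, -r[2], r[0]))


# Placeholder data for illustration only.
rows = [
    ("EIGHT_B", 8, 30.0),
    ("NINE_A", 9, 50.0),
    ("EIGHT_A", 8, 45.0),
    ("NINE_B", 9, 40.0),
]
ranked = rank_by_length(rows)
order = [word for _, word, _, _ in study_order(ranked, batch=1)]
```

With `batch=1` the order is NINE_A, EIGHT_A, NINE_B, EIGHT_B; with the default of 100 you get the first 100 nines, then the first 100 eights, and so on.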

Thanks again for the help.
:arrow: :arrow: :arrow: S:778-ochamp

L'oisleatch McGraw
Enthusiast
Posts: 388
Joined: Sun Dec 13, 2015 2:46 am
Location: Waterford

Re: Which words are the most useful to know?

Post by L'oisleatch McGraw » Wed Jan 20, 2021 5:18 am

I spent this evening getting to grips with your word list... and I have some observations to make. What I am about to write may not be of great relevance, but in case I have stumbled upon something useful that could help you, I am sharing.

Back in 2017 I made my own nerd list in preparation for going on telly. Now, this was in no way as statistically sound or thorough. It was more a patchwork quilt drawn from many sources including gut feeling, experience of commonly occurring words on Apto and CD, Apto's 'most missed' lists etc, with a little bit of input from Paul Erdunast's word playability list (he hates when I call it "The NERDUNAST", but... how could you not?! :mrgreen: ) It would have had a LOT more input from Paul's, but it was very close to my TV appearance when I got that list, so I had to make some very rushed adjustments to include at least some of his work...

Anyhow, my list is about words that a mid-level Apterite, without natural anagramming flair, cannot readily see in a selection... and it gives crutches (falseagrams / stemmers) to make them more visible. So this would mean some words (very few) that are easy to spot (like RELATIONS) are not on the list at all.

In all, the list contains 77 six letter words, 327 seven letter words, 465 eight letter words, and 370 nine letter words, rated roughly according to perceived usefulness.

I've been cross referencing my old list with yours (I've only looked at 9-letter words so far), and in doing so, found a few outliers.


Firstly there are the words I had listed as high importance, that were lower down on yours:-

WORD        EOIN's APPROX RANK    ROB's APPROX RANK
BOTANISES   1st - 100th range     800th - 900th range
CORTISONE   1st - 100th range     600th - 700th range
GINORITES   1st - 100th range     300th - 400th range
ARTINITES   1st - 100th range     400th - 500th range
AMNESTIES   1st - 100th range     800th - 900th range
DRIPSTONE   1st - 100th range     300th - 400th range
TORSIONAL   1st - 100th range     400th - 500th range
RESONATOR   1st - 100th range     500th - 600th range
DREARIEST   1st - 100th range     300th - 400th range
RESTORALS   1st - 100th range     Outside Top 2000 nines

BANDOLIER   100th - 200th range   500th - 600th range
IGNORABLE   100th - 200th range   1000th - 1100th range
DEERHOUND   100th - 200th range   Outside Top 2000 nines
SEALSTONE   100th - 200th range   600th - 700th range
LIMESTONE   100th - 200th range   400th - 500th range
DEMOTIONS   100th - 200th range   700th - 800th range
ADELASTER   100th - 200th range   600th - 700th range
SERMONISE   100th - 200th range   1100th - 1200th range



Then there are the words of high importance (top 400 nines) on your list that were absent from mine:-

WORD        ROB's APPROX RANK
OPERATICS   106th most important nine
CREMATION   156th most important nine
MACONITES   191st most important nine
ADEMPTION   230th most important nine
PERISOMAL   264th most important nine
OSMICATED   307th most important nine
PEDICATOR   317th most important nine
PROLAMINE   328th most important nine
DECIMATOR   331st most important nine
METAPORES   340th most important nine
PREMIATES   341st most important nine
AIRPHONES   342nd most important nine
HARMONISE   343rd most important nine
PANTIHOSE   353rd most important nine
ATROPHIES   354th most important nine
BROMINATE   384th most important nine
IMPOSTURE   385th most important nine
PINAFORES   389th most important nine
NORMATIVE   392nd most important nine
ABREPTION   399th most important nine



From this quick and dirty analysis, I wonder is there something a little askew in the current playability algorithm? It feels like certain words with repeated common letters (e.g. Botanises, Amnesties, Restorals, etc) are more likely to appear in a selection, than words containing more than one of the less common letters, but without any repeated letters (e.g. Operatics, Ademption, Decimator, etc)

What's your take? Do you think there may be something in this?

--------------------------------------------------------------------------------------

TL;DR
Perhaps combinations of repeated common letters (e.g. oo, ee, ss, rr) are more likely to appear in a nine letter word than a combination of two less common letters, but with no repeats in the selection (e.g. mb, ph, cp, cv)?
:arrow: :arrow: :arrow: S:778-ochamp

L'oisleatch McGraw
Enthusiast
Posts: 388
Joined: Sun Dec 13, 2015 2:46 am
Location: Waterford

Re: Which words are the most useful to know?

Post by L'oisleatch McGraw » Sat Jan 23, 2021 10:42 pm

Doing a little more nerding around this...

I listed 20 (more problematic imo) nine-letter words from both our lists in order of playability, and then checked how often each word had appeared on Countdown, and on Apterous. I then listed 20 nine-letter words in order of playability that were in the Top 100 of both of our lists, as a control sample.
--------------------------------
If a word appeared 4 or more times on CD, and 160 or more times on Apto, it is coloured RED on the lists. (aka best!)
If a word appeared 4 or more times on CD, but 159 or fewer times on Apto, it is coloured GREEN on the lists. (aka next best)
If a word appeared 3 or fewer times on CD, but 160 or more times on Apto, it is coloured BLUE on the lists. (aka 3rd best!)
If a word appeared 3 or fewer times on CD, and 159 or fewer times on Apto, it is coloured BLACK on the lists. (aka worst!)
--------------------------------
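
The colour rule above, written out as a small function in case anyone wants to apply it to their own counts. The thresholds are the ones stated above; the function and parameter names are just for illustration.

```python
def colour(cd_count, apto_count, cd_min=4, apto_min=160):
    """Classify a word by how often it appeared on Countdown and on apterous."""
    on_cd = cd_count >= cd_min
    on_apto = apto_count >= apto_min
    if on_cd and on_apto:
        return "RED"    # best
    if on_cd:
        return "GREEN"  # next best
    if on_apto:
        return "BLUE"   # 3rd best
    return "BLACK"      # worst
```

For example, `colour(4, 159)` gives GREEN and `colour(3, 160)` gives BLUE.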

First off, 20 common words that are in my Top 400, but absent from Rob's

[Image: colour-coded word list, not reproduced here]

Next up, 20 common words that are in Rob's top 400, but absent from mine.

[Image: colour-coded word list, not reproduced here]

Finally, 20 of the most common nine-letter words imaginable! These are Top 100 on both of the lists.

[Image: colour-coded word list, not reproduced here]


What do these lists tell us?

--------------------------------

My take:-
All the RED words belong in the top 400 most playable 9s.
Some of the black and blue words are new additions to the dictionary and cannot be fairly judged in this context.
Some of the black and blue words don't belong in the top 400.
The words coloured GREEN are generally far more important than the words coloured BLUE, when determining which words to prioritise.
:arrow: :arrow: :arrow: S:778-ochamp

Jon O'Neill
Ginger Ninja
Posts: 4361
Joined: Tue Jan 22, 2008 12:45 am
Location: London, UK

Re: Which words are the most useful to know?

Post by Jon O'Neill » Sat Jan 23, 2021 10:49 pm

This is all good stuff.

Splitting hairs but one caveat is that some of the words would not have been in the dictionary for as long as others. While there is no "date entered dictionary" thing on Lexplorer, you could work out appearances/day since the first appearance.

That's not perfect because the volume of games has fluctuated over time. And also newly appearing words will suffer in the Lexplorer count because they wouldn't have been spotted as much early on as people learned the word.
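
The appearances-per-day idea could be sketched as below. The count and dates in the example are hypothetical, and it does nothing to correct for the two caveats just mentioned (fluctuating game volume, and new words being spotted less at first).

```python
from datetime import date


def appearances_per_day(count, first_seen, as_of):
    """Appearances per day since the word's first recorded appearance."""
    days = (as_of - first_seen).days + 1  # count the first day itself
    return count / days


# Hypothetical word: 160 appearances over a 10-day window.
rate = appearances_per_day(160, date(2020, 1, 1), date(2020, 1, 10))
```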

Gavin Chipper
Post-apocalypse
Posts: 10158
Joined: Mon Jan 21, 2008 10:37 pm

Re: Which words are the most useful to know?

Post by Gavin Chipper » Sat Jan 23, 2021 10:57 pm

Wouldn't all maxes appear in the Lexplorer count?

Jon O'Neill
Ginger Ninja
Posts: 4361
Joined: Tue Jan 22, 2008 12:45 am
Location: London, UK

Re: Which words are the most useful to know?

Post by Jon O'Neill » Sat Jan 23, 2021 11:08 pm

Yes, but twice if both players get it.

L'oisleatch McGraw
Enthusiast
Posts: 388
Joined: Sun Dec 13, 2015 2:46 am
Location: Waterford

Re: Which words are the most useful to know?

Post by L'oisleatch McGraw » Sun Jan 24, 2021 3:42 am

Jon O'Neill wrote:
Sat Jan 23, 2021 10:49 pm
Splitting hairs
Everything I've written in the last few days on this thread is about splitting hairs. This is a near perfect resource, and fair play to Rob (and anyone else involved) for making it. Is it perfect? No. Will it ever be perfect? No. Why be an arsehole and point that out? Because I think there is potential to bring it a lot closer to perfection, otherwise there'd be no point in posting nitpicky comments.

I think the main thing I've picked up on in the above analyses is that, while the original list may have been a bit skewed towards favouring double-letter words... the new version seems skewed a bit too much against them, perhaps in particular EE, AA, and OO (based on an observational hunch rather than anything solid).

------------------------------------------------------------

Whilst striving toward perfection, you'd have to wonder about some of the following:

1. Do we have anyone on the inside (i.e. Countdown Team) who has contributed any data (re: shuffling, actual letters distributions, how often the letters distribution is tweaked if ever, etc)?

2. If not, how is the letters distribution data derived? Statistical analysis from the wiki?

3. What would Countdown Team make of all of this? Would they view it favourably, or frown upon it, or not give a crap? If the Apterisation of the Series Finals is something that gets on their tits, then surely a list like this would be similarly annoying, as it further polarises the top Apterites from the rest of the field.

4. How stable is the letters distribution on CD? Will it be tweaked in future? How regular would such tweaks be?

5. How invested is Rob (and whoever helps) in this list? Would a new version be released every time there is a dictionary update, or perhaps a new version once a year... or are there no plans to do future updates?

------------------------------------------------------------

The reason I am so interested in this list all of a sudden, is I'm about to launch an ongoing project to upload videos with falseagrams and stemmers to YouTube. Each video will be titled something along the lines of "Essential 9s: Vol1", "Essential 7s: Vol7" etc, and will contain 100 words per episode... and I was hoping to base the "essentialness" of the words on Rob's playability index.
:arrow: :arrow: :arrow: S:778-ochamp

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Sun Jan 24, 2021 11:21 am

L'oisleatch McGraw wrote:
Sun Jan 24, 2021 3:42 am
1. Do we have anyone on the inside (i.e. Countdown Team) who has contributed any data (re: shuffling, actual letters distributions, how often the letters distribution is tweaked if ever, etc)?
Graeme posted an interesting thread a while back (which I can't find) where he did a statistical analysis on how likely it is that the letters tiles are just shuffled face down like a deck of cards with no tweaks. He found that there was something like a 0.01% chance that no face-up tweaking was happening.

IIRC members of the Countdown Team posted in that thread to assure us that nothing like face-up tweaking was happening, as you can see here.
I also did some of my own simulations a few months ago and posted the findings which you may find interesting

L'oisleatch McGraw wrote:
Sun Jan 24, 2021 3:42 am
3. What would Countdown Team make of all of this? Would they view it favourably, or frown upon it, or not give a crap? If the Apterisation of the Series Finals is something that gets on their tits, then surely a list like this would be similarly annoying, as it further polarises the top Apterites from the rest of the field.
I can't speak for them, but Countdown is an (albeit quite niche) competitive intellectual pursuit. Just like Chess, Scrabble, Maths Olympiads, Quizzes etc there will always be people who take it seriously enough to go the extra mile and beat the competition by practising better than anybody else.
- Before apterous it was people like Julian Fell who were probably absolutely hammering their handheld electronic copy of the game to get in loads more practice than you could by watching the show, and then studying the paper dictionary to pick out high probability words
- Then when apterous came out you had Kirk Bevins who was there at the birth and was the first person to really put in the hours on apterous (and anahack) to get himself head and shoulders above the rest.
- Now you have players who study from probability based lists, and who do their own coding to help themselves pick out the stuff they want to be studying. Rob (best letters player at the moment) is an example of somebody who does that, and he is notably better than the next best on apterous (who I assume are people who fall more into the Bevins breed of smashing apterous hard)
L'oisleatch McGraw wrote:
Sun Jan 24, 2021 3:42 am
5. How invested is Rob (and whoever helps) in this list? Would a new version be released every time there is a dictionary update, or perhaps a new version once a year... or are there no plans to do future updates?
Yes, I believe Rob does update it when there are new releases.

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Sun Jan 24, 2021 12:19 pm

L'oisleatch McGraw wrote:
Sun Jan 24, 2021 3:42 am
Everything I've written in the last few days on this thread is about splitting hairs. This is a near perfect resource, and fair play to Rob (and anyone else involved) for making it. Is it perfect? No. Will it ever be perfect? No. Why be an arsehole and point that out? Because I think there is potential to bring it a lot closer to perfection, otherwise there'd be no point in posting nitpicky comments.
1) You're not being an arsehole
2) You are right that the list probably has room for improvements. However, I think those improvements would be very very marginal, and probably only make a difference for players at Rob's level. Best thing to do if you want to get better is just gobble up the list starting from the top.

Sam Cappleman-Lynes
Acolyte
Posts: 123
Joined: Sun Apr 07, 2013 11:30 pm

Re: Which words are the most useful to know?

Post by Sam Cappleman-Lynes » Sun Jan 24, 2021 12:39 pm

Just a quick note from me to say that probabilities are really weird and unintuitive, and in fact the lack of repeated letters at the top of Rob's list is entirely expected from a statistical point of view, assuming no shenanigans in the shuffling (which is probably a dud assumption but never mind).

A single instance of a repeated letter in a selection, all other things being equal, roughly halves (!) the probability of that selection appearing.
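
One way to see where the halving comes from: count the number of distinct orders in which a selection's letters can come out. The sketch below assumes a simplified model in which every order of the nine letters is otherwise equally likely, ignoring the separate vowel and consonant piles and the fact that tiles are drawn without replacement, so treat it as a rough illustration only.

```python
from collections import Counter
from math import factorial


def orderings(selection):
    """Number of distinct orders in which the letters of a selection can appear."""
    total = factorial(len(selection))
    for multiplicity in Counter(selection).values():
        total //= factorial(multiplicity)  # identical letters are interchangeable
    return total


no_repeats = orderings("OPERATICS")    # nine different letters
one_repeat = orderings("BOTANISES")    # two S
two_repeats = orderings("AMNESTIES")   # two E and two S
```

OPERATICS has 362880 orderings, BOTANISES has half that (181440) and AMNESTIES a quarter (90720), so each repeated pair halves the number of ways the selection can turn up.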

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Sun Jan 24, 2021 4:16 pm

I would always encourage any suggestions as to how to make my list more accurate! So thanks for taking the time to make these insights.

My general philosophy was to make a list that's as objectively correct as possible, with no artificial tweaks applied after looking at the list to reinforce what 'should' be at the top according to people's preconceptions. You can't get a lot more generalised than 'look at a lot of (75 million) Countdown rounds and count how many times the words come up' so this is the basis of the approach I used.

Using this method, there are (I think) only three things that could significantly change the result of the simulation:
- The letters distribution. I borrowed the distribution that Graeme suggested on the Ask Graeme thread, which he arrived at from looking at frequency counts of letters over a recent series. Here it is FWIW: [B:2, C:3, D:6, F:2, G:4, H:2, J:1, K:1, L:5, M:4, N:8, P:4, Q:1, R:9, S:9, T:9, V:2, W:2, X:1, Y:1, Z:1] [A:15, E:20, I:13, O:13, U:7]. While the CD letters distribution does appear to fluctuate slightly over time, it's nothing major.
- The weightings of letter picks. I designed my list to be customisable in this respect because picking impacts word usefulness enormously; there's no point in learning a lot of 5V nines if you and your opponent barely ever pick it, and vice versa. The default is [5V:10%, 4V:60%, 3V:30%] (although your doc might be working from an older version which used equal weightings for all three vowel picks?). I took this ratio from apterous picking data.
- The shuffling algorithm. As Jack mentioned, the show (and apterous, almost certainly) takes steps to ensure that the same letters are less likely to appear consecutively in the deck. The method I used to emulate this is: "draw a letter, if it's identical to the last letter, discard it and draw again. Keep whatever this new letter is, even if it's identical again". Shuffling is the aspect which I have looked into the least though. If Charlie would be willing to share the exact letters distribution used on apterous or the shuffling algorithm with me, I'd happily re-run my simulations with these parameters. I imagine this is apterous secret sauce though. I think my method is a good approximation otherwise.
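
A minimal sketch of how these three ingredients might fit together for one simulated round. It is an illustration rather than the simulator's actual code: the names `build_deck`, `draw`, `draw_many` and `simulate_round` are made up, and it assumes "the last letter" means the previous letter drawn from the same pile.

```python
import random

# Letter counts and pick weightings as quoted in the post above.
CONSONANTS = {"B": 2, "C": 3, "D": 6, "F": 2, "G": 4, "H": 2, "J": 1, "K": 1,
              "L": 5, "M": 4, "N": 8, "P": 4, "Q": 1, "R": 9, "S": 9, "T": 9,
              "V": 2, "W": 2, "X": 1, "Y": 1, "Z": 1}
VOWELS = {"A": 15, "E": 20, "I": 13, "O": 13, "U": 7}
PICK_WEIGHTS = {5: 0.10, 4: 0.60, 3: 0.30}  # number of vowels -> probability

def build_deck(counts, rng):
    """Expand the counts into a list of letters and shuffle it."""
    deck = [letter for letter, n in counts.items() for _ in range(n)]
    rng.shuffle(deck)
    return deck

def draw(deck, last):
    """Draw one letter; if it matches the previous one, discard it and draw
    once more, keeping the second letter whatever it is."""
    letter = deck.pop()
    if letter == last and deck:
        letter = deck.pop()
    return letter

def draw_many(counts, how_many, rng):
    """Draw several letters from a freshly shuffled pile."""
    deck = build_deck(counts, rng)
    picked, last = [], None
    for _ in range(how_many):
        last = draw(deck, last)
        picked.append(last)
    return picked

def simulate_round(rng):
    """One isolated round: choose a vowel count, then draw from fresh piles."""
    n_vowels = rng.choices(list(PICK_WEIGHTS),
                           weights=list(PICK_WEIGHTS.values()))[0]
    vowels = draw_many(VOWELS, n_vowels, rng)
    consonants = draw_many(CONSONANTS, 9 - n_vowels, rng)
    return "".join(vowels + consonants)

rng = random.Random(2021)
example_round = simulate_round(rng)
```

Counting how often each word fits the simulated selections would additionally need a lexicon, which is left out here.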

Answers to a couple of other questions that cropped up:
- It's just me that works on this
- I update the lexicon and run a new simulation whenever apterous publishes its dictionary changes.

Gavin Chipper
Post-apocalypse
Posts: 10158
Joined: Mon Jan 21, 2008 10:37 pm

Re: Which words are the most useful to know?

Post by Gavin Chipper » Sun Jan 24, 2021 4:51 pm

Sam Cappleman-Lynes wrote:
Sun Jan 24, 2021 12:39 pm
Just a quick note from me to say that probabilities are really weird and unintuitive, and in fact the lack of repeated letters at the top of Rob's list is entirely expected from a statistical point of view, assuming no shenanigans in the shuffling (which is probably a dud assumption but never mind).

A single instance of a repeated letter in a selection, all other things being equal, roughly halves (!) the probability of that selection appearing.
The shuffling shenanigans seem to be geared towards making it even less likely you'll get repeated letters. So while it may be a dud assumption that there are no shenanigans, the shenanigans push things further in the direction you're talking about.

Jonathan Willis
Newbie
Posts: 19
Joined: Tue Jan 12, 2021 6:02 pm

Re: Which words are the most useful to know?

Post by Jonathan Willis » Sun Jan 24, 2021 5:05 pm

Does anyone know how the letters are shuffled on Countdown? Surely they don't just pick letters at random, or we would end up with 3 Zs, 2 Qs etc!

L'oisleatch McGraw
Enthusiast
Posts: 388
Joined: Sun Dec 13, 2015 2:46 am
Location: Waterford

Re: Which words are the most useful to know?

Post by L'oisleatch McGraw » Sun Jan 24, 2021 5:44 pm

Robert Foster wrote:
Sun Jan 24, 2021 4:16 pm
- It's just me that works on this
- I update the lexicon and run a new simulation whenever Apterous publishes its dictionary changes.
Well, the replies that have come in since have me convinced that it is currently as near perfect as it needs to be... So I can put to bed any lingering reservations and power ahead with the YouTube project with full gusto.

To conclude, you are a fucking superstar imo. :)
Thanks for the resource and for being so willing to hear opinions, make tweaks etc.

------------------------------------------------------------

[Edited to answer this question: "although your doc might be working from an older version which used equal weightings for all three vowel picks?". No, when you mentioned there was a new version, I got the latest (V4?) before making the 3 lists above.]

Sam Cappleman-Lynes
Acolyte
Posts: 123
Joined: Sun Apr 07, 2013 11:30 pm

Re: Which words are the most useful to know?

Post by Sam Cappleman-Lynes » Sun Jan 24, 2021 7:38 pm

Robert Foster wrote:
Sun Jan 24, 2021 4:16 pm
While the CD letters distribution does appear to fluctuate slightly over time, it's nothing major.
I had a game in which 5 C's were drawn, and I think someone has also mentioned to me a game with 6 C's, which is double what you use in your simulations!

For my own lists I wrote code that would scrape wiki data for the latest series and come up with a letters distribution based on that, but I must have just written that as a one-off throwaway because I can't seem to find it any more. Otherwise, I would have offered to share it.

Charlie Reams
Site Admin
Posts: 9473
Joined: Fri Jan 11, 2008 2:33 pm
Location: Cambridge

Re: Which words are the most useful to know?

Post by Charlie Reams » Sun Jan 24, 2021 7:57 pm

Robert Foster wrote:
Sun Jan 24, 2021 4:16 pm
- The shuffling algorithm. As Jack mentioned, the show (and apterous, almost certainly) takes steps to ensure that the same letters are less likely to appear consecutively in the deck. The method I used to emulate this is: "draw a letter, if it's identical to the last letter, discard it and draw again. Keep whatever this new letter is, even if it's identical again". Shuffling is the aspect which I have looked into the least though. If Charlie would be willing to share the exact letters distribution used on apterous or the shuffling algorithm with me, I'd happily re-run my simulations with these parameters. I imagine this is apterous secret sauce though. I think my method is a good approximation otherwise.
The apterous algorithm is not a huge trade secret, but it is intended to shuffle a deck for use in a full game rather than a particular round, so certain things would shake out differently. (For example, apterous builds the deck for the entire game in one pass so duplicate aversion would span rounds, e.g. if the last vowel of one round is A, it's slightly less likely that the first vowel of the next round is A.) It's definitely less stringent at avoiding doubles and trebles than your approach and has a bunch of other tweaks, but my approach is also a best-guess so it's perfectly possible that yours is a better simulation of Countdown itself.

If you're interested, I'd be happy to give you a big file of all the selections from the last N years (since whenever I last changed the algorithm) and you could use those as raw data, rather than trying to simulate all the vagaries of letter picking, de-dupe etc.

Do you simulate the full 11 rounds every time? This is probably a very marginal effect, but one of the consequences of active de-duplication is that you will tend to push common letters to later in the game (once the dupes start to "stack up" at the end of the deck and become harder to avoid), which shifts the distribution slightly.
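
To make the "stacking up" effect concrete, here is a hypothetical one-pass deck builder. It is not the apterous algorithm, which is not given in this thread: `build_game_deck`, its swap rule and `adjacent_duplicates` are assumptions made purely for illustration, using the vowel counts quoted earlier.

```python
import random
from collections import Counter

# Vowel counts as quoted earlier in the thread.
VOWELS = {"A": 15, "E": 20, "I": 13, "O": 13, "U": 7}

def build_game_deck(counts, rng):
    """Lay out a whole pile in one pass, avoiding back-to-back duplicates
    whenever a different letter is still available (hypothetical rule)."""
    pool = [letter for letter, n in counts.items() for _ in range(n)]
    rng.shuffle(pool)
    deck = []
    while pool:
        letter = pool.pop()
        if deck and letter == deck[-1]:
            # Swap with the nearest remaining letter that differs, if any.
            for i in range(len(pool) - 1, -1, -1):
                if pool[i] != letter:
                    pool[i], letter = letter, pool[i]
                    break
            # If nothing differs, the duplicate is kept: this is where
            # repeats pile up towards the end of the deck.
        deck.append(letter)
    return deck

def adjacent_duplicates(deck):
    """Count positions where a letter equals the one before it."""
    return sum(a == b for a, b in zip(deck, deck[1:]))

game_deck = build_game_deck(VOWELS, random.Random(0))
dupes = adjacent_duplicates(game_deck)
```

Under a rule like this, any repeats that survive can only occur once the rest of the pool has run out of different letters, so they cluster at the tail of the deck rather than being spread evenly through the game.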

Robert Foster
Rookie
Posts: 42
Joined: Sun Apr 17, 2011 1:42 pm
Location: Bletchley

Re: Which words are the most useful to know?

Post by Robert Foster » Sun Jan 24, 2021 9:07 pm

My sim generates every round in isolation - the consonant and vowel decks are reshuffled after every round, and duplicate aversion doesn't span rounds. So it's possible for a Q to come up in several consecutive rounds, unlike TV Countdown, but the 'stacking up' effect that you mention is avoided.

Would love to run some apterous rounds through my sim - I've sent you an aptomail!

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Sun Jan 24, 2021 9:36 pm

Robert Foster wrote:
Sun Jan 24, 2021 9:07 pm
My sim generates every round in isolation - the consonant and vowel decks are reshuffled after every round, and duplicate aversion doesn't span rounds. So it's possible for a Q to come up in several consecutive rounds, unlike TV Countdown, but the 'stacking up' effect that you mention is avoided.

Would love to run some apterous rounds through my sim - I've sent you an aptomail!
Would be super interested to see if the simulator generates a significantly different word ordering when letters are generated more like Charlie describes.

JackHurst
Series 63 Champion
Posts: 1505
Joined: Tue Jan 20, 2009 8:40 pm
Location: Leics

Re: Which words are the most useful to know?

Post by JackHurst » Sun Jan 24, 2021 9:37 pm

Also Rob, how many rounds do you simulate, and how long does it take to run?
