Calling statisticians
Moderator: Jon O'Neill
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Calling statisticians
Do any of our resident statisticians have a smart guess for what sort of distribution this data might be drawn from? I've clipped off the long tail but it approaches zero pretty steadily.
Re: Calling statisticians
Log-normal maybe?
They kind of look similar. I'm not too strong on stats so that's about the most insight I can offer.
- Ben Wilson
- Legend
- Posts: 4545
- Joined: Fri Jan 11, 2008 5:05 pm
- Location: North Hykeham
Re: Calling statisticians
Does kinda look like a skewed normal to me too, but my stats are so rusty it's unreal.
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Re: Calling statisticians
Log-normal seems very plausible based on the source. It's the data on how long it takes people to solve conundrums on Apterous, if you're interested. I'm doing something interesting with this data which I'll share at some point.
Re: Calling statisticians
Just had an idea that it might be an Erlang distribution, but you'd expect that to have a flatter peak given the length of the tail, and I can't see any reason that conundrum times would generate Erlang data now that's been revealed as the source.
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Re: Calling statisticians
It does look a bit Erlangy (in fact now you've said that I realise that's what was making it look familiar in the first place) but I know that human reaction times are distributed log-normal so it seems possible that other brain activities would be similar. I'll do some tests and find out.
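One quick check of the log-normal idea, sketched here under assumptions: if the times are log-normal, their logarithms should look roughly normal, and in particular roughly symmetric. The sample below is a made-up stand-in generated with arbitrary parameters, not the Apterous data, and the `skewness` helper is a hypothetical name used only for illustration.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness (third standardised moment); close to 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical stand-in for solving times in seconds (NOT the real data).
rng = random.Random(0)
times = [rng.lognormvariate(1.5, 0.6) for _ in range(5000)]

raw_skew = skewness(times)
log_skew = skewness([math.log(t) for t in times])
print(f"skewness of raw times: {raw_skew:.2f}")
print(f"skewness of log times: {log_skew:.2f}")
```

A long right tail shows up as clearly positive skewness in the raw times; if taking logs brings it close to zero, that is consistent with (though not proof of) a log-normal.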
- Kai Laddiman
- Fanatic
- Posts: 2314
- Joined: Wed Oct 15, 2008 3:37 pm
- Location: My bedroom
Re: Calling statisticians
My ranking on Apterous before and after I cheated?
16/10/2007 - Episode 4460
Dinos Sfyris 76 - 78 Dorian Lidell
Proof that even idiots can get well and truly mainwheeled.
- Frank Rodolf
- Rookie
- Posts: 74
- Joined: Sun Nov 02, 2008 3:22 pm
- Location: Eindhoven, the Netherlands
Re: Calling statisticians
Charlie Reams wrote: It does look a bit Erlangy (in fact now you've said that I realise that's what was making it look familiar in the first place) but I know that human reaction times are distributed log-normal so it seems possible that other brain activities would be similar. I'll do some tests and find out.
And today's Daily Duel was one of those tests?
Frank
- Kirk Bevins
- God
- Posts: 4923
- Joined: Mon Jan 21, 2008 5:18 pm
- Location: York, UK
Re: Calling statisticians
This is the best off topic thread yet - love the curiosity of Charlie and love the responses.
-
- Post-apocalypse
- Posts: 13271
- Joined: Mon Jan 21, 2008 10:37 pm
Re: Calling statisticians
Charlie Reams wrote: It does look a bit Erlangy (in fact now you've said that I realise that's what was making it look familiar in the first place) but I know that human reaction times are distributed log-normal so it seems possible that other brain activities would be similar. I'll do some tests and find out.
Frank Rodolf wrote: And today's Daily Duel was one of those tests?
Would that work? Without any competition from any opposition, people are more likely to check and double check their answers. Unless Charlie has done that thing that was talked about where only the fastest gets the points. I'll do the duel now...
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Re: Calling statisticians
Charlie Reams wrote: It does look a bit Erlangy (in fact now you've said that I realise that's what was making it look familiar in the first place) but I know that human reaction times are distributed log-normal so it seems possible that other brain activities would be similar. I'll do some tests and find out.
Frank Rodolf wrote: And today's Daily Duel was one of those tests?
You overestimate my organisation. That duel was lined up ages ago. I just meant statistical tests on the existing data.
-
- Kiloposter
- Posts: 1955
- Joined: Mon Jan 21, 2008 9:02 am
- Location: UK
Re: Calling statisticians
It has a vague likeness to a Poisson Distribution with a mean of around 3 to 5, though it doesn't tail off quite quick enough. See the mean 4 example here.
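For anyone who wants to see the shape being described, here is a minimal sketch that prints the Poisson probabilities for mean 4 as a crude text histogram. Note that the Poisson is a distribution over whole-number counts, so any comparison with continuous solving times is about shape only; `poisson_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

mean = 4
probs = [poisson_pmf(k, mean) for k in range(13)]
for k, p in enumerate(probs):
    # One '#' per percentage point, rounded.
    print(f"{k:2d} {p:.4f} {'#' * round(p * 100)}")
```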
- Michael Wallace
- Racoonteur
- Posts: 5458
- Joined: Mon Jan 21, 2008 5:01 am
- Location: London
Re: Calling statisticians
My first thought was a gamma, but log-normal looks about right too (depending on the parameters, obviously). If I wasn't in the middle of playing computer games I might think about the actual problem to try and decide which distributions are most appropriate.
Also, these data, not this data, n00b
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Re: Calling statisticians
Michael Wallace wrote: My first thought was a gamma, but log-normal looks about right too (depending on the parameters, obviously). If I wasn't in the middle of playing computer games I might think about the actual problem to try and decide which distributions are most appropriate.
Log-normal fits the data fairly well, but I'm still open to better suggestions. If anyone wants the raw data to play with then let me know.
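One way to put "fits fairly well" on a comparable footing between the two candidates mentioned so far is to fit each and compare log-likelihoods. The sketch below is only illustrative: the sample is synthetic (not the real times), the log-normal is fitted by maximum likelihood, and the gamma is fitted by the simpler method of moments rather than a full maximum-likelihood fit. The function names are hypothetical.

```python
import math
import random
import statistics

def lognormal_loglik(data):
    """Log-likelihood of a log-normal fitted by maximum likelihood."""
    logs = [math.log(x) for x in data]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # MLE uses the population form
    n = len(data)
    squared = sum((v - mu) ** 2 for v in logs)
    return (-sum(logs)
            - n * math.log(sigma * math.sqrt(2 * math.pi))
            - squared / (2 * sigma ** 2))

def gamma_loglik(data):
    """Log-likelihood of a gamma fitted by the method of moments (an assumption)."""
    mean = statistics.fmean(data)
    var = statistics.pvariance(data)
    shape = mean ** 2 / var
    scale = var / mean
    n = len(data)
    body = sum((shape - 1) * math.log(x) - x / scale for x in data)
    return body - n * (shape * math.log(scale) + math.lgamma(shape))

# Hypothetical stand-in sample, drawn from a log-normal on purpose.
rng = random.Random(1)
times = [rng.lognormvariate(1.5, 0.6) for _ in range(5000)]

print("log-normal:", round(lognormal_loglik(times), 1))
print("gamma:     ", round(gamma_loglik(times), 1))
```

The higher log-likelihood indicates the better fit on that sample. Both families have two parameters here, so no penalty for model size is needed for this particular comparison.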
Michael Wallace wrote: Also, these data, not this data, n00b
I'll start saying "these data" when you start saying "one panino please".
- Michael Wallace
- Racoonteur
- Posts: 5458
- Joined: Mon Jan 21, 2008 5:01 am
- Location: London
Re: Calling statisticians
Charlie Reams wrote: I'll start saying "these data" when you start saying "one panino please".
The wife and I make a point of saying pannino, not pannini, so nyer.
(not that I can remember ever asking for a pannino (or pannini))
- Kirk Bevins
- God
- Posts: 4923
- Joined: Mon Jan 21, 2008 5:18 pm
- Location: York, UK
Re: Calling statisticians
Charlie Reams wrote: I'll start saying "these data" when you start saying "one panino please".
Michael Wallace wrote: The wife and I make a point of saying pannino, not pannini, so nyer.
(not that I can remember ever asking for a pannino (or pannini))
Please try and spell them correctly. I always ask "do you do panini?" which sounds a bit odd and they then say "yes, we have bacon paninis, or cheese paninis". "I'll have a bacon panino please". I then had one woman say "sorry?" and I just said "a bacon one please" out of semi-embarrassment. Why should I get embarrassed by being correct?
- Michael Wallace
- Racoonteur
- Posts: 5458
- Joined: Mon Jan 21, 2008 5:01 am
- Location: London
Re: Calling statisticians
Michael Wallace wrote: The wife and I make a point of saying pannino, not pannini, so nyer.
(not that I can remember ever asking for a pannino (or pannini))
Kirk Bevins wrote: Please try and spell them correctly.
Weird - I thought it was panini and the wife corrected me, and then I (somehow) thought that the forum spellchecker agreed with him, but clearly my eye was playing tricks on me.
Basically it wasn't my fault >_>
- Ben Hunter
- Kiloposter
- Posts: 1770
- Joined: Wed Oct 15, 2008 2:54 pm
- Location: S Yorks
Re: Calling statisticians
Charlie Reams wrote: I'll start saying "these data" when you start saying "one panino please".
Michael Wallace wrote: The wife and I make a point of saying pannino, not pannini, so nyer.
(not that I can remember ever asking for a pannino (or pannini))
Kirk Bevins wrote: Please try and spell them correctly. I always ask "do you do panini?" which sounds a bit odd and they then say "yes, we have bacon paninis, or cheese paninis". "I'll have a bacon panino please". I then had one woman say "sorry?" and I just said "a bacon one please" out of semi-embarrassment. Why should I get embarrassed by being correct?
Correctness is a matter of context when it comes to language, though I'll probably use 'panino' in future, purely as a pretext for charming banter with attractive sandwich shop girls.
- Michael Wallace
- Racoonteur
- Posts: 5458
- Joined: Mon Jan 21, 2008 5:01 am
- Location: London
Re: Calling statisticians
Ben Hunter wrote: Correctness is a matter of context when it comes to language, though I'll probably use 'panino' in future, purely as a pretext for charming banter with attractive sandwich shop girls.
I don't know about anyone else, but I for one am certainly interested to find out whether your panino exploits get you anywhere...
Re: Calling statisticians
It looks like my pyjama bottoms in the morning
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Re: Calling statisticians
Ben Hunter wrote: Correctness is a matter of context when it comes to language, though I'll probably use 'panino' in future, purely as a pretext for charming banter with attractive sandwich shop girls.
I actually did this last time I was in Clowns, a cafe in Cambridge which is run by Italians. The ASSG (attractive sandwich shop girl) said "ohh, very good Italian" and smiled at me. It wasn't quite the full sex I was expecting, but still rewarding.
- Michael Wallace
- Racoonteur
- Posts: 5458
- Joined: Mon Jan 21, 2008 5:01 am
- Location: London
Re: Calling statisticians
So I was thinking about this on the tube this morning. My main thoughts were about what factors are going to affect solving time, and then once you have these you can try and fit a model.
The two most obvious ones are player ability and conundrum difficulty. The first is easy to factor into our model, thanks to ratings (give or take the various problems with the system), the second one less so. I don't know how many conundrums have been given in multiple games, but that's one option for trying to assess their difficulty. Another might be some statistic for each conundrum on how often the word is used in English (although that's probably not easily available).
There are obviously going to be heaps of other things that influence the solving time, such as whether it's crucial (I would imagine people might be trying less hard if they've already won), or if the conundrum is needed to make a game a particularly good score. I doubt the second has much of an influence, and I'm not really convinced the first would either. There are probably other factors too, though.
But yeah, I'd start with data on the first two, assuming there's some extra information available to assess the conundrum difficulty, and then stick them into a model, maybe Time ~ Gamma(a,b) where a and b are functions of those factors. More interesting though would probably be using these data to get an assessment of the difficulty of conundrums, which is probably easier to do anyway.
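A rough sketch of what "Time ~ Gamma(a,b) where a and b are functions of those factors" could look like in code. Everything specific here is an assumption made for illustration: the shape is held constant, the mean depends on rating and difficulty through a log link, the coefficient values are picked by hand rather than fitted, and the rows are invented rather than taken from Apterous.

```python
import math

def gamma_params(rating, difficulty, coef):
    """Map the two factors to a Gamma (shape, scale) pair.

    Assumed form: mean = exp(b0 + b_rating*rating + b_diff*difficulty),
    with a constant shape, so scale = mean / shape.
    """
    b0, b_rating, b_diff, shape = coef
    mean = math.exp(b0 + b_rating * rating + b_diff * difficulty)
    return shape, mean / shape

def gamma_logpdf(t, shape, scale):
    return ((shape - 1) * math.log(t) - t / scale
            - shape * math.log(scale) - math.lgamma(shape))

def log_likelihood(rows, coef):
    """Sum of log-densities over (time, rating, difficulty) rows."""
    return sum(gamma_logpdf(t, *gamma_params(r, d, coef)) for t, r, d in rows)

# Hypothetical rows: (seconds, rating in thousands, difficulty score from 0 to 1).
rows = [(3.2, 1.8, 0.2), (9.5, 1.2, 0.7), (5.1, 1.5, 0.4),
        (14.0, 1.0, 0.9), (2.4, 1.9, 0.3)]

with_factors = (2.5, -0.8, 1.0, 4.0)  # better players faster, hard words slower
no_factors = (1.9, 0.0, 0.0, 4.0)     # same mean time for everyone

print(log_likelihood(rows, with_factors))
print(log_likelihood(rows, no_factors))
```

In practice the coefficients would be chosen to maximise `log_likelihood` rather than set by hand; the comparison above only shows how a model with the two factors can be scored against one without them.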
- Charlie Reams
- Site Admin
- Posts: 9494
- Joined: Fri Jan 11, 2008 2:33 pm
- Location: Cambridge
- Contact:
Re: Calling statisticians
Michael Wallace wrote: But yeah, I'd start with data on the first two, assuming there's some extra information available to assess the conundrum difficulty, and then stick them into a model, maybe Time ~ Gamma(a,b) where a and b are functions of those factors. More interesting though would probably be using these data to get an assessment of the difficulty of conundrums, which is probably easier to do anyway.
That's exactly what I'm doing, although it's harder than it sounds because, with over 8000 conundrums, the data for any given conundrum is pretty sparse. There are some other complications too, which I'll share when I write up the results some time next week.
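One common way to cope with sparse per-item data, offered here purely as an illustration and not as the method actually being used, is to shrink each conundrum's average log-time towards the overall average, with less shrinkage the more solves it has. The `prior_strength` value, the function name and the example words and times are all made up, and the sketch ignores player ability entirely.

```python
import math
import statistics
from collections import defaultdict

def shrunk_difficulty(solves, prior_strength=5.0):
    """Return {conundrum: mean log-time shrunk towards the overall mean}.

    solves is an iterable of (conundrum, seconds) pairs. prior_strength acts
    like a number of imaginary solves at the overall average (an assumption).
    """
    by_conundrum = defaultdict(list)
    for word, seconds in solves:
        by_conundrum[word].append(math.log(seconds))
    all_logs = [v for logs in by_conundrum.values() for v in logs]
    overall = statistics.fmean(all_logs)
    return {
        word: (sum(logs) + prior_strength * overall) / (len(logs) + prior_strength)
        for word, logs in by_conundrum.items()
    }

# Hypothetical data: one conundrum seen once, another seen ten times.
solves = [("RARE", 25.0)] + [
    ("COMMON", t) for t in (3.0, 4.0, 5.0, 4.0, 3.5, 4.5, 4.0, 5.0, 3.0, 4.0)
]
difficulty = shrunk_difficulty(solves)
for word, score in difficulty.items():
    print(word, round(score, 3))
```

The single slow solve still marks RARE as harder than COMMON, but its score is pulled well back from the raw value, which is the intended effect when one observation is all there is.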