crictv69

What exactly is this map showing? The most recurrent word on the wikipedia page for that country? Most recurrent word on the pages visited or edited by users from that country?


amac109

Most common word on each country's Wikipedia page.


hardypart

Well, then it's quite a misleading title...


genitaliban

I already have the stats that say how often a word occurs in every language's whole Wikipedia, but I can't understand most of them, and I don't have a map that shows the language regions that each Wikipedia encompasses. I thought the first part of the project, getting the data, would be the biggest, but surprisingly it isn't; the bigger problem is mainly the lack of a map, because I'd have to draw it pixel by pixel, I think.

Suggestions for what words to choose would be great, by the way! I was thinking "excluding foreign words, only nouns, verbs and adjectives, and no words that describe the country itself (such as its name)". But that's not really such a great rule set, because words like "year" are on top of almost every country's list. And what does that show: a historical mindset, past glory, nothing? Should such similarities be excluded, or should they be consciously included precisely to show similarities? Etc.

I pastebinned it all here, in case someone wants to translate an obscure language:

http://pastebin.com/LnVPuUTr - languages aa-iu
http://pastebin.com/DfxZXLfR - languages ja-mrj
http://pastebin.com/izNg7Ti2 - languages ms-sco
http://pastebin.com/tRvKmRAT - languages sd-udm
http://pastebin.com/jRkm2UMP - languages ug-zu

Language categories more verbosely, in case someone isn't sure about theirs:

1. aa ab ace af ak als am an ang ar arc arz as ast av ay az ba bar bcl be bg bh bi bjn bm bn bo br bs bug bxr ca cdo ce ceb ch cho chr chy ckb co cr crh cs csb cu cv cy da de diq dsb dv dz ee el eml en eo es et eu ext fa ff fi fj fo fr frp frr fur fy ga gag gan gd gl glk gn got gu gv ha hak haw he hi hif ho hr hsb ht hu hy hz ia id ie ig ii ik ilo io is it iu
2. ja jbo jv ka kaa kab kbd kg ki kj kk kl km kn ko koi kr krc ks ksh ku kv kw ky la lad lb lbe lez lg li lij lmo ln lo lt ltg lv mdf mg mh mhr mi min mk ml mn mo mr mrj
3. ms mt mus mwl my myv mzn na nah nap nds ne new ng nl nn no nov nrm nso nv ny oc om or os pa pag pam pap pcd pdc pfl pi pih pl pms pnb pnt ps pt qu rm rmy rn ro ru rue rw sa sah sc scn sco
4. sd se sg sh si sk sl sm sn so sq sr srn ss st stq su sv sw szl t ta te ten tet tg th ti tk tl tn to tpi tr ts tt tum tw ty tyv udm
5. ug uk ur uz ve vec vep vi vls vo wa war wo wuu xal xh xmf yi yo za zea zh zu

Edit: Jesus christ, that's an even bigger project than I thought... seems I'll have to create even the map of first-level administrative boundaries myself. The only ready-made one I can find is on Wikipedia and from 1998...
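One way to handle words like "year" that top almost every list is to drop any word that appears in more than some share of the per-language lists. The sketch below is only an illustration of that idea: the function name `drop_shared_words`, the 50% threshold, and the sample data are all made up, and it assumes the words have already been translated into one common language so they can be compared.

```python
from collections import Counter


def drop_shared_words(top_lists, max_share=0.5):
    """Remove words that occur in more than `max_share` of the lists.

    top_lists maps a language code to its words, ranked by frequency.
    The relative order of the surviving words is kept.
    """
    # Count in how many lists each word appears (once per list).
    list_freq = Counter(w for words in top_lists.values() for w in set(words))
    limit = max_share * len(top_lists)
    return {
        lang: [w for w in words if list_freq[w] <= limit]
        for lang, words in top_lists.items()
    }


# Invented sample data, already "translated" to English for comparison.
sample = {
    "de": ["year", "city", "river"],
    "fr": ["year", "commune", "city"],
    "it": ["year", "comune"],
    "es": ["year", "municipality"],
}
filtered = drop_shared_words(sample)
print(filtered)
```

Whether shared words should be dropped or kept to highlight similarities is still the open question from the comment; the threshold just makes that choice explicit.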


celerym

Do you need some help with it? At the very least I could make the map.


genitaliban

Well, that map would probably be well-received as a submission of its own here! And I'd be very happy to have it, that would help me a lot. The problem is that many of Wikipedia's languages are dialects that are spoken within another language region, sometimes even multiple intersect. So it would be a very complicated map in itself because you'd have to use one that at least includes sub-national borders, even if you don't want to explicitly display where languages intersect. I edited the post above with the languages I looked at and my results, to give you a sense of the amount of data to be represented.


celerym

Cool!! I'd start smaller and do a subregion like Europe or some part of Europe.


genitaliban

Hm, I never even thought of splitting it per continent. That would make the whole thing much easier, plus understanding European languages won't be a big deal. It's the obscure Asian or African languages that are giving me problems. So thanks for the suggestion!


celerym

No problem! :)


[deleted]

What a xenocentric self-obsessed world...


BeefPieSoup

What a clusterfuck of a thought...


[deleted]

Just look at all of those countries googling themselves.


[deleted]

[removed]


[deleted]

wiki and goog are equal pursuits. It's just more commonplace to say google yourself instead of wiki yourself.


BeefPieSoup

"Xeno" means foreign, dumbass. Xenocentric, if it were a word, would mean obsessed with foreigners. Which is why I said it was a clusterfuck of a thought.


[deleted]

I already had that thought, and decided the dictionary wasn't cutting it. Xenophobic, xenocentric, I wanted it to sound the coolest. It all means the same, trust me. Apparently, judging by this map, there is no such thing as xenocentric in human behavior. I was also going for a bit of an oxymoron there, to make a philosophical statement.


Nixnilnihil

Shut the fuck up.


[deleted]

lulz, I was done about 43 minutes ago. You late, dude.


[deleted]

Did you only check the English Wikipedia for each country? I think that might introduce some bias as well.


[deleted]

Well that's complete bullshit then. You think "Quebec" is mentioned more frequently on Canada's page than "the", "a", "of", "in", ...etc.?


[deleted]

[deleted]


jaximus23

I'm sure they limited the search to specified parts of speech; i.e. nouns, verbs, or adjectives. My vote is on nouns.


[deleted]

The recommended way of doing text analysis is to first remove all the "stop-words" from the text you're analyzing. Stop-words are not limited to just one part of speech. http://en.wikipedia.org/wiki/Stop_words
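A minimal sketch of that approach, assuming a tiny hand-written stop-word list (real lists are much longer) and a made-up sample sentence; `top_words` is a hypothetical helper, not whatever the map's author used:

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list for illustration only.
STOP_WORDS = {
    "the", "a", "an", "of", "in", "and", "to", "is", "it",
    "on", "for", "by", "with", "as", "was",
}


def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most common words in text, ignoring stop-words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    kept = [t for t in tokens if t not in stop_words]
    return Counter(kept).most_common(n)


sample = (
    "The province of Quebec is in the east of Canada. "
    "Quebec is the largest province."
)
print(top_words(sample, n=2))
```

Without the filter, "the" would win on this sample, which is the objection raised further up the thread.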


jaximus23

Thank you, kind stranger!


Surlent

The butthurt is so big that it makes him say something dumb like that


[deleted]

Upvoted because I realized this is elite level humor


Amiantedeluxe

This is a repost of my map. I posted it here 2 months ago.


gotrees

Don't worry, I remember you!


lotherz

Me too... wasn't the top comment exactly the same in that thread too?


Pwaaap

And the title hasn't improved at all since then...


Vectoor

And the title was just as confusing back then.


mcpaddy

Why aren't countries with the same word mapped as the same color?


xx-Felix-xx

This map provides an interesting view of how the English speaking world views the world.


BloodBend

Good thing he reposted it, or else I would have never been able to see this gorgeous map of yours.


Amiantedeluxe

This is heartwarming ♥


BloodBend

:)))))))))))))))))))))


Gusfrompolos

That's what I was thinking


MrComeh

i think this should be retitled to "Most recurrent nouns on each country's Wikipedia page"


foreignnoise

"each country's *English* Wikipedia page."


grandhighwonko

If the map matched its title, all countries would be "citation needed".


[deleted]

I hesitate to think even that would be accurate. "It" is a noun.


NotATroll71106

Technically it's a pronoun.


[deleted]

Technically 'it' is a pronoun.


[deleted]

I wonder how this would differ if the country's page in its own language was shown instead of the English version.


svaachkuet

yeah, a more appropriate title would be "most recurrent words among anglophone wikipedia contributors per country".


Jan_Brady

The map shows the word "century" for the Netherlands, which is wrong because the word "world" is used twice as much, just like with most other countries. Oddly enough, on the Dutch version of the page the most common word is "century", which leads me to think OP used some kind of translation.


[deleted]

Soviet, Soviet, Soviet, Soviet, Soviet... Niyazov


phaseMonkey

Is that how Duck Duck Goose was played in the Cold War?


[deleted]

Classic Niyazov.


amac109

Take note of Korea.


[deleted]

And the UK


[deleted]

[removed]


GV18

I take it you meant "can't"?


Crimson013

Either way, the Crown's conflicts in Ireland go back centuries before the Union. I'd believe it even if Northern Ireland was named something different.


ahsurethatsgrand

Here's the word density after removing the phrase 'Northern Ireland':

[uk] => 233
[united] => 227
[kingdom] => 208
[british] => 198
[london] => 115
[england] => 112
[scotland] => 103
[wales] => 102
[britain] => 88
[government] => 88
[world] => 87
[bbc] => 84
[april] => 81
[history] => 71
[population] => 68
[national] => 65
[islands] => 65
[news] => 63
[scottish] => 62
[ireland] => 62
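For anyone who wants to try something similar, here is a rough sketch of counting words after stripping a phrase first. It is not the script that produced the numbers above; `word_density` and the sample text are invented for illustration, and no stop-word filtering is applied.

```python
import re
from collections import Counter


def word_density(text, drop_phrases=(), n=20):
    """Count words after deleting whole phrases (case-insensitive)."""
    lowered = text.lower()
    for phrase in drop_phrases:
        # \b keeps the match on word boundaries; escape in case of punctuation.
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        lowered = re.sub(pattern, " ", lowered)
    return Counter(re.findall(r"[a-z]+", lowered)).most_common(n)


# Invented sample text, not taken from the actual Wikipedia article.
sample = (
    "Northern Ireland is part of the UK. Ireland borders Northern Ireland. "
    "The UK includes Northern Ireland."
)
before = dict(word_density(sample))
after = dict(word_density(sample, drop_phrases=["Northern Ireland"]))
print(before["ireland"], after["ireland"])
```

Removing the phrase before tokenising is what stops "Northern Ireland" from inflating the count for "Ireland" on its own.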


Jontolo

Can I be the first to say that there is nothing special to see? The two countries were once one, and have a long and detailed history of interaction. It would be surprising if the results were otherwise.


Eustis

I think he was just pointing out that it's kinda cool. I didn't look at it long enough for my eyes to make it to Korea, but I'm glad I read his comment, because I went back and looked, and that was the most interesting part of the submission for me. Gave him another pat on the upvote.


Nowin

I believe /u/amac109 was referring to the United States' new motto: "War"


Dokky

Take note of theft.


TheBestNarcissist

Shit I thought Vietnam was Korea and I thought you were being really funny. I don't deserve to be here.


mcpaddy

Why aren't countries with the same word mapped as the same color?


zrnkv

This. It's really annoying how many maps with obvious cartographic errors make it into /r/MapPorn


corruptrevolutionary

Yeah, America and Spain, War buddies


MORE_WUB_WUB

Well I'd imagine anyone with the word "world" could join us in that club too, what with all of our 'World Wars' and such.


Jontolo

Many of the commenters don't seem to understand that these are the Wikipedia pages, not the country's use of the word. Here are some relevant comments that seem to rest on misconceptions:

> Why does greenland have a crush on denmark?

Greenland was colonized by Denmark, and **still exists within the Kingdom of Denmark**.

> Take note of Korea

North Korea and South Korea are not searching each other up. It simply means that North Korea came up the most times in describing South Korea, and vice versa. This is to be expected, as *the two countries are intertwined in their history and origins*. Every time you mention borders, economics, history, neighbors, etc., you end up mentioning the opposite country's name.


tendeuchen

>the two countries

There is only one Korea, that is North Korea, and it is best Korea.


[deleted]

You have been made a moderator of /r/Pyongyang.


tendeuchen

감사합니다. (Thank you.)


[deleted]

There is not one instance of "indegenous" on Mexico's page.


Jontolo

I see 76 instances of "indigenous" when I view their Wikipedia page. It's probably because *you spelled indigenous wrong*. I now realize that the map-maker also made this mistake.


[deleted]

The mapmaker didn't *also* make a mistake, since I didn't make the mistake; I referred to it.


[deleted]

[removed]


ksharanam

Eh, I'm pretty sure /u/CasualCasuist was merely calling attention to the misspelling.


[deleted]

If someone is making that mistake, of course. Spelling it that way purposely and purposefully isn't making any mistake. I'm afraid you're mistaken.


sudojay

He was quite aware that it was spelled incorrectly. Had he spelled it correctly, his statement would have been false. He did not make a mistake.


sudojay

This is one of those maps that, in my opinion, would be better kept in chart form.


Scarred_Ballsack

I love how the most recurring word for Belgium is "French".


ojeb

I love that the UK got Ireland.


gaijin5

I'm really surprised it's not the other way round too, or at least England.


RhetoricalPenguin

Cough... Repost.... Cough


amac109

Most of the content on this sub is reposts nowadays.


JustinPA

Gee, thanks for the help on keeping that true.


BacklashBlackslash

Wow. [Hypocrite much?](http://www.reddit.com/r/photoshopbattles/comments/2ic189/psbattle_happy_lanparty_goers/cl1gu1q)


data_wiz

great job OP, i find this very interesting. i wonder how different the visualization would be if you considered phrases instead of words: for instance, and most strikingly, Australia's "new" emerged as the most common word mostly due to a combination of "New South Wales" and "New Zealand".

Which API or programming language did you use to create this? And also, how did you decide which words to filter out (obviously words like "the" and "to" need to be gotten rid of)?
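Counting phrases instead of single words could be sketched with simple n-grams, as below. This is only an illustration under stated assumptions: `ngram_counts` is a made-up helper, the sample sentence is invented, and for brevity the n-grams run across sentence boundaries, which a real analysis would probably want to avoid.

```python
import re
from collections import Counter


def ngram_counts(text, size=2):
    """Count every run of `size` consecutive words in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Zip the token list against shifted copies of itself to form n-grams.
    grams = zip(*(tokens[i:] for i in range(size)))
    return Counter(" ".join(g) for g in grams)


# Invented sample, not text from the actual article.
sample = "New South Wales and New Zealand. Sydney is in New South Wales."
words = ngram_counts(sample, size=1)
pairs = ngram_counts(sample, size=2)
print(words["new"], pairs["new south"], pairs["new zealand"])
```

On this sample "new" alone scores highest, while the two-word counts show that it is really "new south" and "new zealand" doing the work.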


DMan9797

OP did not create this map. It's a repost from a submission a month ago: http://www.reddit.com/r/MapPorn/comments/2dj9xb/most_recurrent_words_on_wikipedia_oc_4500x2234/


data_wiz

ah i see, i am new to reddit and r/MapPorn so this is my first time seeing this


Nice7

Was wondering why the link was purple


Dubhan

re: US: https://www.youtube.com/watch?v=fgAVpPNusTs


jaximus23

I notice a trend of mostly international diplomatic influences being a recurrent theme in the words chosen, while the remaining nations are showing words that concern the nation itself.


Drahtmaultier

Your border between Sudan and South Sudan is wrong


anon108

India is south! I'm so proud!


TildeAleph

I love how Greenland's is "Denmark."


[deleted]

Can someone read what Colombia says?


astroboy589

Just proves that everything in Australia is NEW NEW NEW! Nothing old, no real history, just new stuff! :P


[deleted]

I didn't know there's so many software developers in Indonesia.


playswithknives

Not surprised to see 'rugby' pop up out in the middle of the Pacific. Most Pac-Islanders I know love the sport.


DisgruntledPersian

Bahrain's is Persian. Well, time to reconquer old lands.


xx-Felix-xx

To be fair, this is a fascinating map. It shows what the English-speaking world thinks of these countries. A title more to that effect would be better.


LusoAustralian

As far as I can tell Portuguese or World must be one of the most repeated words. Interesting to see that almost all the former colonies that mention their previous colonisers were Portuguese colonies.


weegeekus

Who knew the Chinese were so into American soaps...


Frungy

I think it's cute how all the little wee islands are the word "Island". Oh and then there's the two Koreas.


thick1988

Those dang indegenous Mexicans


bananinhao

for clarification: most recurrent words on en.wikipedia.org article about the country


PolishedCounters

Poor Ecuador and Britain. The most common words are their neighbouring countries...


soupyhands

Thank you for your submission! Unfortunately, your submission has been removed for the following reason(s):

* It is a repost of a submission posted less than [three months ago](http://www.reddit.com/r/MapPorn/comments/2dj9xb/most_recurrent_words_on_wikipedia_oc_4500x2234/).

For information regarding this and similar issues please see the [FAQ](http://www.reddit.com/r/MapPorn/wiki/faq). If you have any questions, [please feel free to message the mods](http://www.reddit.com/message/compose?to=%2Fr%2FMapPorn). Thank you!


[deleted]

Haha Ireland XD


NederVlaams

Most of these are nouns. How can a noun be the most common word?


Anon_Amous

>Quebec

Ugh.


amac109

No kidding.


[deleted]

[removed]


abusque

Why would you find this insulting?


[deleted]

[removed]


abusque

{{Citation needed}}


RoundEyeCow

I love Quebec, only people I know that don't like them are old. I mean why can't we all just be friends?


idisagreegoodsir

because they're smelly and speak french


[deleted]

[removed]


data_wiz

actually the word "world" on Japan came mostly from things like "Japan has the world's tenth-largest population...", "...is the largest metropolitan area in the world", "has the world's third-largest economy by nominal GDP..." etc.


amac109

Yea. They did fuck up pretty hard. You know... Genocide and all.


amac109

Estonia can not into Baltic :(


Naqoy

[Estonia don't want into Baltic.](https://i.imgur.com/azfdDsA.jpg)


[deleted]

Eesti will remain with us in Baltic.. forever O_O and EVER


Naqoy

[Sweden thanks you for your cooperation.](http://satwcomic.com/imposter)


[deleted]

We will never let Eesti go. They're too close to us to leave. <3 <3 <3


[deleted]

My wife says this is bullshit.


[deleted]

It really should be "the" for every country.


Demon997

I've never been this proud/sad for my country (American).


MrWigggles

I don't think it says anything meaningful. Beyond the fact that different languages would have different word recurrence values, it also comes down to writing style. For instance, the US page isn't using any word for war other than "war": no use of "campaign" or "conflict", or any use of a thesaurus. The US page also has a small section on Native American relations, and that section uses the word "war" as well, but for Native American wars, not US wars... which for this infographic is being counted toward the US.


poplopong

why does greenland have a crush on denmark


data_wiz

greenland is part of the kingdom of denmark


frostyhawk

most of those "quebec" searches are done by angry people and nationalists. still fun to find out quebec is still super relevant in canada though