How to detect languages of webpages in bulk using Google Docs

April 16, 2013

There’s a neat little function in Google Docs spreadsheets that detects language: =DetectLanguage(). Yep, that simple.

Except that it only detects the language of the text in a cell. To detect the language of a webpage, we can use it alongside =ImportXml, which pulls the page’s <title> into the spreadsheet, and run the detection on that title.

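Your formula would then become =DetectLanguage(ImportXml(url, "//title")), where url stands for the page’s address, either as a quoted string or as a reference to a cell that holds it. Use straight quotes in the formula, since curly quotes will not parse. Heads up: you’re limited to 50 ImportXml calls per spreadsheet, so either copy the results and paste them back as values, or create a script that does this for you.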
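As a rough idea of what such a script could look like, here is a minimal Google Apps Script sketch. It assumes a layout that is not from the original spreadsheet: page URLs in column A from row 2 down, with results written to column B. The function names buildFormula and detectLanguagesInBulk are made up for illustration. It writes the formula into one cell at a time and then overwrites it with its value, so only one live ImportXml call exists at any moment.

```javascript
// Hypothetical layout: page URLs in column A from row 2 down, results in column B.
var RESULT_COLUMN = 2;
var FIRST_ROW = 2;

// Builds the formula from the post for one row, reading the URL from column A.
function buildFormula(row) {
  return '=DetectLanguage(ImportXml(A' + row + ', "//title"))';
}

// Apps Script entry point; it only runs inside a script bound to a spreadsheet.
function detectLanguagesInBulk() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var lastRow = sheet.getLastRow();
  for (var row = FIRST_ROW; row <= lastRow; row++) {
    var cell = sheet.getRange(row, RESULT_COLUMN);
    cell.setFormula(buildFormula(row));
    // Apply pending changes before reading the result back.
    SpreadsheetApp.flush();
    // Replace the formula with whatever value it currently shows.
    cell.setValue(cell.getValue());
  }
}
```

An import can take a moment to load, so a cell may end up holding a loading or error message instead of a language code. Clearing those cells and running the script again is probably the simplest fix.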

Below is a spreadsheet you can use to see what’s going on. You might need to make a copy of it from the File menu. Also, be aware that the =DetectLanguage function returns ISO language codes (for example, en for English) rather than language names.
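If you would rather read names than codes, a small lookup can translate them. The sketch below is Apps Script again and assumes the codes come back as two-letter values such as en or fr. The table is deliberately incomplete, and languageName is a hypothetical helper, not something built into the spreadsheet.

```javascript
// Deliberately incomplete lookup from two-letter ISO codes to English names.
var LANGUAGE_NAMES = {
  en: 'English',
  fr: 'French',
  de: 'German',
  es: 'Spanish',
  it: 'Italian',
  nl: 'Dutch',
  pt: 'Portuguese',
  ja: 'Japanese'
};

// Returns the language name for a code, or the code itself if it is not listed.
function languageName(code) {
  var key = String(code).trim().toLowerCase();
  if (Object.prototype.hasOwnProperty.call(LANGUAGE_NAMES, key)) {
    return LANGUAGE_NAMES[key];
  }
  return code;
}
```

Added to a script bound to the spreadsheet, this could be called from a cell as a custom function, for example =languageName(B2) if the detected code sits in B2.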

Here is the link to the actual spreadsheet.


Author Dave
