A few of you have asked for a way to bypass the limit of 50 ImportXML calls in Google Docs, so I’ve decided to release something publicly.

Click here to view the spreadsheet

Just make sure to sign in, then make a copy, then press the run button once to authorize the script. If the script doesn’t run, or isn’t there, see the section below.

How does it work?

Please keep in mind I AM NOT A PROGRAMMER, but I do ensure that my code works properly – so please be constructive with your feedback :)

The only way I could do this efficiently was to keep the ImportXML formula in the sheet and have a script feed it one URL at a time. I was never able to write the formula from the script (the Range class’s setFormula method) and then replace it with its value fast enough. Even if I did manage to copy the values and clear the ImportXML formula from the cell, it would either time out, throw errors or, very rarely, work.

Another fun issue was that Google Docs would store the ImportXML results in cache, but would display #N/A when I ran through the first loop. WTF. OK, so I added another loop and now it displays the right results. Don’t ask, I have no idea, but it works.
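
If you end up rebuilding the sheet by hand, the helper cells have to exist before the script can do anything. Going by the source code further down, each URL is written into F2 and the result is read back from F3, so F3 needs a formula that points at F2. Here is a minimal sketch of a one-time setup, assuming that formula has the form =IMPORTXML(F2, xpath). The names setUpHelperCells and setUpTitleScrape and the "//title" XPath are made up for illustration and are not part of the original spreadsheet.

```javascript
// Hypothetical one-time setup for the helper cells that bulkXml() reads.
// Assumption (not confirmed by the original sheet): F2 holds the URL being
// fetched and F3 holds an ImportXML formula that points at F2.
function setUpHelperCells(sheet, xpath) {
  // Double any quotes so the XPath survives inside a Sheets string literal.
  var escaped = String(xpath).replace(/"/g, '""');
  var formula = '=IMPORTXML(F2, "' + escaped + '")';
  sheet.getRange(3, 6).setFormula(formula); // row 3, column 6 = F3
  return formula;
}

// Example wrapper to run from the script editor; "//title" is a placeholder.
function setUpTitleScrape() {
  setUpHelperCells(SpreadsheetApp.getActiveSheet(), '//title');
}
```

This writes the formula once and leaves it alone, which is different from rewriting it for every URL. After that, the script only has to change the URL in F2.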

The script isn’t authorizing, or it’s not there!

Yep, that can happen – here is the source code.

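function bulkXml() {

  var sheet = SpreadsheetApp.getActiveSheet();
  // inputBox returns text, so convert it to a number before using it.
  var num = parseInt(Browser.inputBox("How many URLs do you need to scrape?"), 10);
  if (isNaN(num) || num < 1) {
    return; // cancelled dialog or not a usable number
  }

  // Pass 0 only warms things up (results show as #N/A); pass 1 reads them.
  for (var y = 0; y < 2; y++) {

    // URLs start in row 2 of column A.
    for (var x = 2; x - 2 < num; x++) {

      // Put the URL into F2, then read what the result cell (F3) returns.
      var url = sheet.getRange(x, 1).getValue();
      sheet.getRange(2, 6).setValue(url);
      var xpathResult = sheet.getRange(3, 6).getValue();
      var counter = x - 1;
      sheet.getRange("C4").setValue(" PLEASE WAIT...CURRENTLY FETCHING " + counter + " OUT OF " + num);

      if (y === 1) {
        // Second pass: write the result next to its URL in column B.
        sheet.getRange(x, 2).setValue(xpathResult);
        sheet.getRange("C4").setValue("PROCESSED " + counter + " OUT OF " + num);
        SpreadsheetApp.flush();
      }

    }

  }

}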

function clear() {
  // Empty the URL and result columns (rows 2 to 1000).
  var sheet = SpreadsheetApp.getActiveSheet();
  sheet.getRange("A2:B1000").clearContent();
}

Click on Tools > Script editor and paste the code in there. Make sure you save the script, and then you should be good to go.

When I click on the button nothing happens!

I’ve assigned scripts to the buttons, but the assignments sometimes get lost when you make a copy of the Google doc.

Right-click on the Run button and you’ll see a drop-down arrow in the top right. Click it, select Assign script, then enter: bulkXml
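
If the buttons keep losing their scripts, one possible fallback is a custom menu that calls the same functions. This is only a sketch and not something the original spreadsheet includes: the menu title and item labels are placeholders, and it assumes bulkXml and clear exist in the same script project. onOpen is the simple trigger Apps Script runs when the spreadsheet is opened.

```javascript
// Optional fallback: add a custom menu so the scripts can be started even
// when the buttons have lost their assignments.
// The menu title and item labels are placeholders chosen for illustration.
function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu('Bulk ImportXML')
    .addItem('Run', 'bulkXml')          // calls bulkXml() from the script
    .addItem('Clear results', 'clear')  // calls clear() from the script
    .addToUi();
}
```

Paste it below the other two functions, save, and reload the spreadsheet. The new menu should then show up next to Help.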