Every SEO professional, or simply someone wishing to increase their website positions in the search results has probably been looking for an effective way to obtain specific information about the page.
This information usually includes titles, meta descriptions, or H1 headers.
Until now, the process involved using specialized tools created for this purpose or simply scrutinizing the page code with the use of a programming tool available within the browser. However, recently it turned out that Google Sheets are a great solution for those who would like to automatically obtain the data but don't have time to learn web development.
Let's see how to use web scraping in Google Sheets!
What is IMPORTXML?
IMPORTXML is functionality provided within Google Sheets that can be used to extract particular pieces of website data. Google Support states that the feature gives users the ability to import various types of structured data, such as:
XML
HTML
CSV
TSV
RSS
ATOM XML
In a nutshell, using the IMPORTXML function will allow you to retrieve any structured data from selected web pages, even if you don’t know programming languages!
How to Use This Functionality to Obtain Data?
Using the function itself is very simple, you just need to provide two values. The first one is an exact link to a page from which you want to extract data. The second one is a query, which determines what type of data should be extracted. Just take a look at the following example:
The functionality distinguishes between various types of data that can be extracted, however, the most common pieces include:
SEO Title - For this, you need to type: //title
Meta description of a given subpage - //meta[@name='description']/@content
H1 headers on the page - to retrieve them, use the query: //h1
Links on the page - to retrieve them, use the query: //@href
Proper use of IMPORTXML in a tool such as Google Sheets can help you reduce the amount of work that has to be done manually, as it automates many tasks.
It's worth noting that the data is downloaded in real-time, so when any of the elements on the page changes, the spreadsheet is also modified, and you see updated information. The functionality is much more complex, but today we decided to mention only its most popular features.
Thanks to a properly prepared spreadsheet with the functionality and other formulas you can create a specialist tool, even if you don’t know programming languages such as Python.
Delante is an online marketing agency specialized in generating website traffic from search engines and paid ads (SEO / SEM). Over 80% of our clients are from rapidly growing e-commerce industry.
This website uses cookies so that we can provide you with the best user experience possible. Cookie information is stored in your browser and performs functions such as recognising you when you return to our website and helping our team to understand which sections of the website you find most interesting and useful.
Strictly Necessary Cookies
Strictly Necessary Cookie should be enabled at all times so that we can save your preferences for cookie settings.
If you disable this cookie, we will not be able to save your preferences. This means that every time you visit this website you will need to enable or disable cookies again.