
Web Browser Automation: Extract Data

Chapter 7 | Automate Tasks

Learn how to use web browser actions to extract data from a website and write it to a file. Brigette Matz, Automate Trainer/Consultant, gives you a guided tutorial through the Automate actions needed to extract a dataset from a web browser and write it to a CSV file. In this video you’ll learn how to:

  • Use the Create Session, Extract Table, Get Value, and Write to File actions to set up a task that automatically grabs data from a website
  • Extract web browser data to a dataset and write the dataset to a CSV file

Watch this chapter now to learn how to get started.

 

Transcript

Brigette:           In part two of website automation, we will perform two exercises to extract data from a website and write it to a file. We will start by extracting this product version table to a data set. We'll then write that data set to a CSV file.

                                First, we'll open our web browser actions to select the create session [00:00:30] action. We'll type in web in our search bar, and we'll drag our create session action. This allows Automate to continuously work with the same URL to interact with the website. Dragging the magnifying glass icon to create a green border around the webpage will ensure Automate is interacting with the correct window. Automate has found and input the URL for us there.

                                [00:01:00] After we create the session, we will use the extract table action to copy the table into a data set within Automate. We'll make sure we're in browser session one, and we're going to drag our magnifying glass again to find that page. Then we're going to drag the HTML identifier around the webpage to locate the table. [00:01:30] Automate puts a blue border along the bottom of the table once it's found. You'll also see on the bottom right side of the screen that the table elements have been identified. You'll see that Automate has found the table by the HTML tag, as well as the attribute of table.
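
For readers who think in code, the extract table step is roughly analogous to the sketch below. It is not how Automate works internally; it only illustrates the idea of locating a table by its HTML tags and copying the cells into rows. It uses Python's standard `html.parser`, and the sample markup, the `TableExtractor` class, and the `extract_table` function are hypothetical names made up for illustration. It assumes the page contains a single, non-nested table.

```python
from html.parser import HTMLParser

# Hypothetical sample page: a small product version table.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Version</th><th>Date</th></tr>
  <tr><td>Product A</td><td>1.0</td><td>2020-01-01</td></tr>
  <tr><td>Product B</td><td>2.3</td><td>2020-02-15</td></tr>
</table>
"""


class TableExtractor(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None outside a <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table(html):
    """Return the table in `html` as a list of rows (lists of strings)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = extract_table(SAMPLE_HTML)
```

Note that the header cells come back as the first row of data, which mirrors what the data set looks like later in this chapter.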

                                Then down under interaction, we will create and populate a data [00:02:00] set with the table data so Automate has a place to save the information that it's extracting. We'll just name this one DS_productstable. On the variables pane at the bottom of our window here, we can now see that DS_productstable exists. If we right click and inspect that data, however, there is nothing yet listed. We need to run the task and extract [00:02:30] the data table in order to populate that data set.

                                Go ahead and close this and we'll run our task here. Once that task is executed, we will inspect our data set again, and we're expecting our data set to include all of the information in this table from name all the way down to that last date listed. We'll inspect our data set here. Notice here on [00:03:00] the data set that the columns are called column one, column two, column three, etc., and the true column names are listed on row one. We'll need to remember this for the next step in our task. You'll see we have a few products listed here as well from our table.

                                Next, we'll write the contents of the data set from Automate into a CSV file. In my search bar, I'm going to type data set. You'll see here [00:03:30] that once I type it out, I will be presented with our file system action to convert a data set to a CSV file. We're going to drag that over and then we're going to select the data set that we're converting, which is DS_productstable. Then we're going to key in the file path for the CSV that we're creating. Make sure here that you include the file type extension as well, so .CSV at the end of this one to [00:04:00] ensure that it gets written over.

                                Here under the advanced options, I want to deselect the include column names option because as we saw in our data set view, the column names actually started on row one and Automate set generic column names with numbers. Then we're going to leave the delimiter default to comma here. Let's run the task now. Went super [00:04:30] fast there, so we'll confirm it's been completed in our output pane. If we go ahead and open that folder, we'll see that our CSV now exists and we'll open that up to look at it a little further as well. We'll see here that the data has been copied over exactly as it was displayed in our data set, and that the first row here reflects the HTML column names from the original table.
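
The reason for deselecting the include column names option can be shown with a small sketch. This is only an analogy in Python's standard `csv` module, not Automate's own behavior: the `write_rows_csv` function, the `Column1`-style generic names, and the sample rows are assumptions made up for illustration. Because the real headers already sit in row one of the data, writing the generic names as well would put two header lines in the file.

```python
import csv
import io

# Hypothetical data set contents: the true column names are in row one.
DATASET_ROWS = [
    ["Name", "Version", "Date"],
    ["Product A", "1.0", "2020-01-01"],
    ["Product B", "2.3", "2020-02-15"],
]


def write_rows_csv(rows, out, include_column_names=False, delimiter=","):
    """Write `rows` to the file-like object `out` as delimited text.

    When include_column_names is True, a line of generic names
    (Column1, Column2, ...) is written first, before the data rows.
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    if include_column_names and rows:
        generic = [f"Column{i}" for i in range(1, len(rows[0]) + 1)]
        writer.writerow(generic)
    writer.writerows(rows)


# Option deselected: the first line of the file is the real header row.
without_names = io.StringIO()
write_rows_csv(DATASET_ROWS, without_names, include_column_names=False)

# Option selected: a generic header line appears above the real one.
with_names = io.StringIO()
write_rows_csv(DATASET_ROWS, with_names, include_column_names=True)
```

To write to disk instead of an in-memory buffer, the same function can be passed a file opened with `open(path, "w", newline="")`, where `path` ends in `.csv`.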

                                [00:05:00] Our next exercise will be to extract a specific element off of this webpage. Rather than extracting an entire table, we are going to extract this set of text from the website and we're going to write that to a text file. Here we'll see that we have the create session step from our first exercise, so we can just continue to work with this website. We need to create a variable as a place holder [00:05:30] for that text once it's been extracted from our website, similar to what we did for the table where we extracted that to a data set. For now, we'll leave that value blank, though.

                                Next, we need to extract that value of text from the website, so we're going to type in web here in our search bar again, and we're going to use our get value action. Using browser session one again, we're going to select [00:06:00] that window with our magnifying glass and we're going to also use our HTML identifier to find the text. If I drag it around here, you can see that it's actually able to locate all sorts of individual fields on this page. I'll release my mouse button and wait for Automate to input those locators, the identifiers from that website so that it can find that text.

                                [00:06:30] Sometimes Automate needs a second to try to find that HTML element, but it's taking a little longer than I like here, so let's just go ahead and try it again. Let's get that blue border around the text we need. There we go, looks like it found something there. Then down here with our interaction in this case, we're going to be copying the text over and we're going to place that text into our variable. We're going to place it into the variable that we just created, and then after that we'll call [00:07:00] that variable to write the data to a text file.

                                Next, in our action search bar we're going to search for write, W-R-I-T-E, and we see a write to file action under file system. Here, I'll drag that over and I'm going to use the file folder icon to locate the destination folder. We'll find that under my web extract folder. I'm going to just copy that as text and then we'll paste it back in here. Then I'm just going to key [00:07:30] in the file extension from here. We'll just call this one webtext.txt. Then in the data to write field, we are going to call that variable, so we're going to toggle our expression builder here. We'll double click to select the variable. It will encapsulate our variable into parentheses there, indicating that it's found that variable. We'll just save it here.

                                Lastly, we're going to run the task. Automate's going to copy the text from the website [00:08:00] right here, and then send it to our variable. Then we're going to check for a .TXT file in our folder here, and we'll see when that TXT file exists that our text will be copied over in there. The output pane here confirms the task is complete, so we'll go ahead and we'll look at our file. Open that up, and we'll see there that Automate has written the text from that website and saved the file as expected.
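
The second exercise (get a value, hold it in a variable, write it to a text file) can be sketched in Python as follows. This is an illustration of the flow, not Automate's implementation. It assumes the target element can be found by an `id` attribute, which the video does not state; the `TextById` class, the `get_value` and `save_text` functions, the sample markup, and the `intro` id are all hypothetical. It also assumes well-formed markup with no unclosed tags such as `<br>` inside the target element.

```python
from html.parser import HTMLParser
from pathlib import Path
import tempfile

# Hypothetical sample page with one element we want to read.
SAMPLE_PAGE = '<div><p id="intro">Automate <b>extracts</b> text.</p><p>Other</p></div>'


class TextById(HTMLParser):
    """Collects the text inside the element whose id matches element_id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.parts = []     # text fragments found inside the target
        self._depth = 0     # greater than 0 while inside the target element

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1            # nested tag inside the target
        elif dict(attrs).get("id") == self.element_id:
            self._depth = 1             # entering the target element

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.parts.append(data)


def get_value(html, element_id):
    """Return the text of the element with the given id ('' if not found)."""
    parser = TextById(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def save_text(text, folder, filename="webtext.txt"):
    """Write `text` to folder/filename and return the resulting path."""
    path = Path(folder) / filename
    path.write_text(text, encoding="utf-8")
    return path


# The variable starts blank, is filled by the extraction, then written out.
web_text = ""
web_text = get_value(SAMPLE_PAGE, "intro")
with tempfile.TemporaryDirectory() as folder:
    saved = save_text(web_text, folder)
    written = saved.read_text(encoding="utf-8")
    saved_name = saved.name
```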

Ready for the next chapter?

Chapter 8: Web Browser Automation: Best Practices