How To Find Broken Links/Images From Page Using Selenium WebDriver Example

If you remember, we learned earlier how to extract all links from a page in THIS POST. Extracting the links is not very useful, though, if you do not know whether they all work or whether some of them, including image links, are broken. So how do you find broken links or broken images on a page using Selenium WebDriver? This is a part of testing in which you need to check the status of links and images: 1) link URLs open the targeted page and 2) images display properly on the page. If a link URL is incorrect, it will not work.

Finding each and every link on a page and verifying it manually would take a lot of your time. You will find many broken link checker tools online, but you can perform the same task using Selenium WebDriver. Let's see an example of finding broken links on a single page.

In the example given below, I first count the total number of links on the page. Then I extract the links one by one and check each link's response code by calling the getResponseCode function. I have used the Apache HttpClient HttpResponse interface to get the response code of the URL. If it is 200, the link URL is not broken and works fine. If the response code is 404 or 505, the link or image is broken. Note that the function treats every other response code as valid, and a request that fails with an exception as broken.
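As a small illustration of the rule just described, here is a sketch of a status-code check. It is deliberately broader than the example in this post, which only looks for 404 and 505: it assumes that any 4xx or 5xx code, or a missing response, should count as broken. The names `LinkStatus` and `isBroken` are hypothetical and do not appear in the original example.

```java
class LinkStatus {

    // Assumption: -1 (or any value below 100) stands for "no response received".
    // 1xx, 2xx and 3xx codes are treated as working; 4xx and 5xx as broken.
    static boolean isBroken(int statusCode) {
        return statusCode < 100 || statusCode >= 400;
    }
}
```

With this rule a 500 or 403 response is reported as broken too. Whether that is what you want depends on the site, since some servers answer automated requests with 403 even though the link opens fine in a browser.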

In the example below, I have used a test page where one link and one img URL are broken, to show in practice how it differentiates those links from valid links. Run the Selenium WebDriver test example below in Eclipse and verify the result in the console. The console output will show the status of each link URL, whether it is broken or not.

package Testing_Pack;

import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class BrokenlinksTest {

 public static void main(String[] args) throws IOException {

  WebDriver driver = new FirefoxDriver();
  driver.manage().window().maximize();

  driver.get("http://only-testing-blog.blogspot.com/2013/09/testing.html");

  driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
  
  //Find the total number of links on the page and print it in the console.
  List<WebElement> total_links = driver.findElements(By.tagName("a"));
  System.out.println("Total Number of links found on page = " + total_links.size());
  
  //Loop over all links one by one and request each URL to check its response code.
  boolean isValid = false;
  for (int i = 0; i < total_links.size(); i++) {
   String url = total_links.get(i).getAttribute("href");

   if (url != null) {
    
    //Call the getResponseCode function for each URL to check its response code.
    isValid = getResponseCode(url);
    
    //Print a message based on the value of isValid returned by the getResponseCode function.
    if (isValid) {
     System.out.println("Valid Link:" + url);
     System.out.println("----------XXXX-----------XXXX----------XXXX-----------XXXX----------");
     System.out.println();
    } else {
     System.out.println("Broken Link ------> " + url);
     System.out.println("----------XXXX-----------XXXX----------XXXX-----------XXXX----------");
     System.out.println();
    }
   } else {    
    //If the <a> tag has no href attribute, getAttribute returns null, so print this message.
    System.out.println("String null");
    System.out.println("----------XXXX-----------XXXX----------XXXX-----------XXXX----------");
    System.out.println();
    continue;
   }
  }
  //quit() closes every window and ends the WebDriver session.
  driver.quit();
 }

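 //Function to get the response code of a link URL.
 //Link URL is treated as valid if the response code is anything other than 404 or 505 (for example 200).
 //Link URL is treated as invalid if the response code is 404 or 505, or if the request throws an exception.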
 public static boolean getResponseCode(String chkurl) {
  boolean validResponse = false;
  try {   
   //Get response code of URL
   HttpResponse urlresp = new DefaultHttpClient().execute(new HttpGet(chkurl));
   int resp_Code = urlresp.getStatusLine().getStatusCode();
   System.out.println("Response Code Is : "+resp_Code);
   if ((resp_Code == 404) || (resp_Code == 505)) {
    validResponse = false;
   } else {
    validResponse = true;
   }
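  } catch (Exception e) {
   //Request failed (for example a malformed URL or a connection error), so validResponse stays false.
   System.out.println("Request failed for " + chkurl + " : " + e.getMessage());
  }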
  return validResponse;
 }
}
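DefaultHttpClient is deprecated in newer Apache HttpClient releases. As one possible alternative that needs no extra library, here is a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (Java 11 or later). This is not the method used in the example above. The class name `StatusFetcher`, the method name `statusOf`, the 10 second timeouts, and the choice of a HEAD request are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class StatusFetcher {

    // One shared client; redirects are followed so a moved page is not reported as broken.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NORMAL)
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    // Returns the HTTP status code for the URL, or -1 if no response could be obtained.
    static int statusOf(String url) {
        if (url == null) {
            return -1;
        }
        try {
            // HEAD asks for headers only, so the page or image body is not downloaded.
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                    .method("HEAD", HttpRequest.BodyPublishers.noBody())
                    .timeout(Duration.ofSeconds(10))
                    .build();
            return CLIENT.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (IllegalArgumentException | IOException e) {
            // Malformed URL, unsupported scheme such as mailto:, or a network failure.
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }
}
```

Some servers do not answer HEAD requests properly, so if a link that opens in the browser is reported as broken, retrying it with a GET request is a reasonable fallback.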

The console output for the above example lists the response code of each link, followed by whether that link is valid or broken.


This way you can find broken links or images on any page using Selenium WebDriver.
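A page often repeats the same URL many times, and some href values (mailto:, javascript:) cannot be checked over HTTP at all. The sketch below shows one way to reduce the collected values to a list of unique http/https URLs before checking them. `UrlFilter` and `uniqueHttpUrls` are hypothetical names, and the filtering rules are an assumption, not part of the original example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class UrlFilter {

    // Keeps each http/https URL once, in the order it was first seen.
    // Null values (tags without the attribute) and other schemes are skipped.
    static List<String> uniqueHttpUrls(List<String> rawUrls) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : rawUrls) {
            if (raw == null) {
                continue;
            }
            String url = raw.trim();
            String lower = url.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                seen.add(url);
            }
        }
        return new ArrayList<>(seen);
    }
}
```

To cover images as well as links, you could fill the input list with the `getAttribute("href")` values of the `a` elements and the `getAttribute("src")` values of the `img` elements found with `driver.findElements`, then pass each URL in the filtered list to the response code check.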

11 comments:

  1. Hi Aravind, thanks for posting. I have one question. The above example shows how to identify broken links on one page, but how can this be done for an entire site?

    Can you please explain?

    Replies
    1. You cannot find an entire site's broken links in a single shot. For that, you have to get all the URLs of the site, and then you can use a loop to open these URLs one by one and check for broken links on each page.

  2. DefaultHttpClient is deprecated.
    So, we can use the code below.
    HttpClient client = HttpClientBuilder.create().build();
    HttpGet request = new HttpGet(chkurl);
    HttpResponse response = client.execute(request);

    Replies
    1. Hi find,

      Passing the chkurl, where can it be defined?

  3. Hi Kalpana R,
    Can you please share the full code for this scenario, as I am new to Selenium?

  4. Hi Kalpana,
    Of the obtained URLs, I found that most are duplicated or repeated. Can you tell me how to find the non-duplicate broken links on a web page when checking for the 200 response code?

  5. Hi, how to find broken images and links on a site on ALL PAGES?

  6. how to verify broken links and images on a site for ALL PAGES.

  7. Hello friend, how can I use a TestNG assert statement for this verification? Any answer is really appreciated.

    Thanks,
    Satya

  8. Hi,

    The code uses chkurl. Where is it defined, and which URL do we need to pass?
