Problem

I wanted a reliable dataset of worldwide brands containing key information such as name, location, company size, and website URL. The challenge was that the data was locked inside a large brand directory, split across multiple listing pages and individual brand profile pages. Collecting it manually was slow and did not scale.

Solution

To build a structured dataset, I decided to combine two approaches:

  1. Use Instant Data Scraper (a browser-based tool) to quickly pull brand names and profile URLs from the directory listing pages.

  2. Use a custom Python script to visit each profile URL automatically and extract additional details (location, company size, and website link).

This hybrid method allowed me to scale data extraction efficiently while maintaining accuracy.

Implementation

  • Step 1: Extract Listing Data

    • Used Instant Data Scraper to crawl through listing pages.

    • Captured brand names and their profile URLs.

    • Exported this into a CSV file for further enrichment.

  • Step 2: Build Python Scraper

    • Wrote a script using libraries like requests, BeautifulSoup, and pandas.

    • Script looped through the list of profile URLs.

    • Parsed HTML to extract brand details (location, company size, and official website).
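
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the original script: the CSS selectors in SELECTORS, the profile_url column name, and the one-second delay are all assumptions, since the real values depend on the directory's markup and on how the Instant Data Scraper export was labeled.

```python
import time

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Hypothetical CSS selectors; the real ones depend on the directory's HTML.
SELECTORS = {
    "location": ".brand-location",
    "company_size": ".brand-size",
    "website": "a.brand-website",
}


def parse_profile(html: str) -> dict:
    """Extract location, company size, and website from one profile page."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for field, selector in SELECTORS.items():
        node = soup.select_one(selector)
        if node is None:
            record[field] = None  # field missing on this profile
        elif field == "website":
            record[field] = node.get("href")
        else:
            record[field] = node.get_text(strip=True)
    return record


def enrich(listing: pd.DataFrame, delay: float = 1.0) -> pd.DataFrame:
    """Visit each URL in the assumed 'profile_url' column and append details."""
    rows = []
    with requests.Session() as session:
        for url in listing["profile_url"]:
            try:
                response = session.get(url, timeout=10)
                response.raise_for_status()
                details = parse_profile(response.text)
            except requests.RequestException:
                # Keep the row with empty details so failures can be retried.
                details = dict.fromkeys(SELECTORS)
            rows.append(details)
            time.sleep(delay)  # pause between requests to limit load on the site
    return pd.concat(
        [listing.reset_index(drop=True), pd.DataFrame(rows)], axis=1
    )
```

With the Step 1 export loaded through pd.read_csv, passing the resulting DataFrame to enrich would return the same rows with location, company_size, and website columns added. Keeping parse_profile separate from the network loop makes the parsing logic easy to check against a saved HTML page before running the full crawl.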

Get in touch with me for the Python scraper code.

Result

  • Successfully scraped thousands of brand records across different geographies.

  • Final dataset included:

    • Brand Name

    • Location

    • Company Size

    • Website URL

    • Profile Link (for verification)

  • Reduced manual work by 90% compared to traditional copy-paste research.

  • The dataset became a ready-to-use source for lead generation, competitor analysis, and marketing outreach.
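
As a rough sketch of how an enriched table could be shaped into the final dataset listed above, the snippet below renames columns, drops duplicate profiles, and fixes the column order. The input column names (brand_name, profile_url, location, company_size, website) and the finalize helper are hypothetical and only serve the illustration.

```python
import pandas as pd

# Column order of the final dataset, as listed in the results.
FINAL_COLUMNS = [
    "Brand Name",
    "Location",
    "Company Size",
    "Website URL",
    "Profile Link",
]

# Hypothetical mapping from working column names to the final headers.
RENAMES = {
    "brand_name": "Brand Name",
    "location": "Location",
    "company_size": "Company Size",
    "website": "Website URL",
    "profile_url": "Profile Link",
}


def finalize(enriched: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, drop duplicate profiles, and order the final fields."""
    renamed = enriched.rename(columns=RENAMES)
    # A profile link identifies one brand page, so it serves as the dedupe key.
    deduped = renamed.drop_duplicates(subset="Profile Link")
    return deduped[FINAL_COLUMNS].reset_index(drop=True)
```

The result can then be written out with DataFrame.to_csv (passing index=False) for use in lead generation or outreach tools.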
