As Pekka points out, many companies that offer a public API explicitly prohibit web scraping in their terms of service. Making 4,000 GET requests against their website therefore risks getting you flagged as a malicious user and possibly banned!
The company's API is RESTful and appears to be straightforward and well documented, so I would use it rather than scraping. Once you have your API key, a practical starting point is a shell script that performs the reverse phone number lookups. For instance, if you have a list of 4,000 ten-digit phone numbers stored in a plain text file, one number per line, a basic bash script like this one would do:
#!/bin/bash
INPUT_FILE=phone_numbers.txt
OUTPUT_DIR=output
API_KEY='MyWhitePages.comApiKey'
BASE_URL='http://api.whitepages.com'
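# Make sure the output directory exists before writing to it.
mkdir -p "$OUTPUT_DIR"
# Perform a reverse lookup on each phone number in the input file.
while IFS= read -r PHONE; do
  [ -z "$PHONE" ] && continue  # skip blank lines
  URL="${BASE_URL}/reverse_phone/1.0/?phone=${PHONE};api_key=${API_KEY}"
  # Quote the URL and write to OUTPUT_DIR (the original used an undefined OUTPUT).
  curl -s "$URL" > "${OUTPUT_DIR}/result-${PHONE}.xml"
done < "$INPUT_FILE"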
Once you have all the results, you can either parse the XML to identify matching businesses or simply grep each output file for the phrase "The search did not find results". This is the message the WhitePages.com API returns when no match is found. If grep finds the phrase, the business probably no longer exists (or has changed its phone number). If grep does not find it, the business likely still exists (or another entity now has the same phone number).
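The grep check can be sketched as a small bash function. This is only an illustration: the function name classify_results and the tab-separated output format are my own choices, and it assumes the result files are named result-PHONE.xml as in the script above.

```shell
#!/bin/bash
# Sketch: classify saved lookup results by whether they contain the
# "no results" message. Function name and output format are illustrative.
NO_MATCH_MSG='The search did not find results'

classify_results() {
  local dir="$1" file phone
  for file in "$dir"/result-*.xml; do
    [ -e "$file" ] || continue   # nothing matched: the glob stays literal
    phone="${file##*/result-}"   # strip directory and "result-" prefix
    phone="${phone%.xml}"        # strip ".xml" suffix
    if grep -qF "$NO_MATCH_MSG" "$file"; then
      printf '%s\tno_match\n' "$phone"
    else
      printf '%s\tmatch\n' "$phone"
    fi
  done
}
```

Running classify_results output > summary.tsv would then give one line per phone number. Note that an empty file or an error response (for example from a failed request) would also be reported as "match", so it is worth spot-checking a few results before trusting the summary.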