Projects | Node.js, LinkedIn API, Google Programmable Search API, MongoDB | Dakota Systems Project - Using Google Custom Search API and Reverse IP Search

Simarpreet Singh
It was really impressive to see how much faster the Google Custom Search API was than the Python web scraping bot. Getting experience with another API in an application was great! I also got to run a reverse IP address lookup and store all the collected data in MongoDB at the end.
2021/7/20
The New Solution
As mentioned in my earlier project, where I created a Python web scraper and linked it up with Node.js, that process was simply too slow. The new solution does everything in Node.js, which avoids linking Node.js and Python. To provide a feature similar to the web scraping done by the Selenium Python bot, the Google Programmable Search (Custom Search) API was used.
How It Works
Once the custom search was set up via Google Developers, I got the API key and hooked it up to my Node.js app. I then configured the programmable search to look only on LinkedIn, so the results are narrowed down. After that, I called the custom search with a first and last name, received a JSON object of results back, and looped and filtered through the results until a matching code was found. A few limitations came up at that point; however, the Google API was miles faster than the previous Python web scraping bot. As the project got more involved, I got a lot of exposure to how async functions work within Node.js and had a lot of fun using them.
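The search-and-filter step above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: it assumes Node.js 18+ (for the global `fetch`), the helper names are hypothetical, and the matching rule (both names appearing in the result title) is only a stand-in for whatever check the app really used. The API key and search engine ID are placeholders you would supply yourself.

```javascript
// Endpoint of the Custom Search JSON API.
const SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1";

// Build the request URL for a first-name/last-name query.
// `engineId` is the ID of a programmable search engine that has been
// configured (in the Google control panel) to search LinkedIn only.
function buildSearchUrl(apiKey, engineId, firstName, lastName) {
  const params = new URLSearchParams({
    key: apiKey,
    cx: engineId,
    q: `${firstName} ${lastName}`,
  });
  return `${SEARCH_ENDPOINT}?${params}`;
}

// Hypothetical matching rule: the result title contains both names.
function isMatch(item, firstName, lastName) {
  const title = (item.title || "").toLowerCase();
  return (
    title.includes(firstName.toLowerCase()) &&
    title.includes(lastName.toLowerCase())
  );
}

// Loop through the results and return the first match, or null.
function findMatch(items, firstName, lastName) {
  for (const item of items || []) {
    if (isMatch(item, firstName, lastName)) return item;
  }
  return null;
}

// Call the API and filter the returned `items` array.
async function searchLinkedIn(apiKey, engineId, firstName, lastName) {
  const url = buildSearchUrl(apiKey, engineId, firstName, lastName);
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Custom Search request failed: ${response.status}`);
  }
  const body = await response.json();
  return findMatch(body.items, firstName, lastName);
}
```

Keeping the URL building and the filtering separate from the network call makes both easy to check without spending API quota.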
Limitation
The programmable search doesn't actually scrape the LinkedIn profile page. However, the Google API does return valuable information, such as the user's LinkedIn profile URL, profile picture URL, and a profile snippet that contains the user's current job position and company.
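As a rough illustration, those three pieces of information could be pulled out of a single result item like this. The `link` and `snippet` fields are part of each search result; reading the picture from `pagemap.cse_image` is an assumption, since the contents of `pagemap` vary from page to page. The function name and the output field names are hypothetical.

```javascript
// Pick the profile URL, picture URL, and snippet out of one result item.
// Any field that is missing from the result comes back as null.
function extractProfile(item) {
  const pagemap = item.pagemap || {};
  // Assumption: the profile picture, when present, is the first cse_image.
  const image = (pagemap.cse_image || [])[0];
  return {
    profileUrl: item.link || null,
    pictureUrl: image && image.src ? image.src : null,
    snippet: item.snippet || null,
  };
}
```

Returning `null` for missing fields keeps the rest of the app from having to guess whether a result carried a picture at all.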
Location
The Google API doesn't return the user's location from their LinkedIn profile, so this information had to be obtained via a workaround. The solution was to request the user's IP address from the Node.js app and then use a native Node.js API for MaxMind's GeoLite data to find the location the user is logging in from.
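A minimal sketch of that workaround is below. It assumes the lookup is done with the `geoip-lite` npm package, which is an assumption on my part and not something named above; the lookup function is passed in as a parameter so the sketch does not depend on it. The helper names and the shape of the returned object are hypothetical, and the `X-Forwarded-For` header should only be trusted when the app sits behind a proxy you control.

```javascript
// Read the client's IP address from an incoming Node.js HTTP request.
// Behind a proxy the original address is the first X-Forwarded-For entry;
// otherwise fall back to the socket's remote address.
function getClientIp(req) {
  const forwarded = req.headers["x-forwarded-for"];
  if (forwarded) return forwarded.split(",")[0].trim();
  return req.socket.remoteAddress;
}

// Turn an IP address into a small location object.
// `lookup` is any function that maps an IP to GeoLite-style data
// ({ country, region, city, ll }) or null when the IP is unknown.
function locateIp(ip, lookup) {
  const geo = lookup(ip);
  if (!geo) return null;
  return {
    country: geo.country,
    region: geo.region,
    city: geo.city,
    coordinates: geo.ll, // [latitude, longitude]
  };
}

// Possible usage with geoip-lite (assumed package, installed separately):
//   const geoip = require("geoip-lite");
//   const location = locateIp(getClientIp(req), (ip) => geoip.lookup(ip));
```

Note that this gives the location of the network the user connects from, which is not necessarily the location shown on their LinkedIn profile.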
MongoDB
Finally, once all this information was gathered, I linked MongoDB up to my Node.js application and made a POST call to store all the collected user information in the MongoDB database.
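The storage step might look something like the sketch below, which would be called from the handler of the POST request. It assumes the official `mongodb` driver, whose collections expose `insertOne`; the collection is passed in so the sketch stays independent of any connection details. The function names, the document fields, and the `createdAt` timestamp are all hypothetical and do not reflect the project's actual schema.

```javascript
// Combine the collected profile data and location into one document.
function buildUserDocument(profile, location) {
  return { ...profile, location, createdAt: new Date() };
}

// Insert the document and return the ID MongoDB assigned to it.
// `collection` is expected to behave like a collection from the official
// driver, for example client.db("app").collection("users").
async function saveUser(collection, profile, location) {
  const result = await collection.insertOne(
    buildUserDocument(profile, location)
  );
  return result.insertedId;
}
```

Because `saveUser` only relies on `insertOne`, it can be exercised with a stub object before a real database is connected.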