Can Pipedream be Used to Extract Specific Real Estate Data from a Website for a Beginner with No Coding Experience?

This topic was automatically generated from Slack. You can find the original thread here.

Hello sophisticated developers of the world!

I am a real estate developer based in Australia and I just discovered Pipedream tonight while watching random tech & AI videos on YouTube.

I wondered if Pipedream could be used to “scan” a local real estate website, https://www.realestate.com.au/, and report data into Excel based on certain search parameters.

For example,
“Search for suburbs where the price of house or land >800m2 is being sold for <$1m and new townhouses are being sold for >$1.2m”

I know absolutely nothing about coding or really any of this stuff for that matter. I am a complete beginner in this space.

Is something like this possible with Pipedream, and if so, what is the best way for me to go about having it created?

Thank you very much!

Hey there!

Sergio here!! Sophisticated developer from Mexico. I collaborate with Pipedream by onboarding apps to the Pipedream marketplace.

I can confirm that what you want to do is perfectly possible. I can imagine two approaches here.

One is that the real estate website you use has an API available (API stands for Application Programming Interface; think of a user interface, but you do the interfacing with code!). On Pipedream, you can connect anything that exposes a public API (meaning it is open to the general public, not restricted to the developers of that website or service).

The second approach is to scan for the data using a “scraping strategy” via code. Scraping is the act of parsing and extracting data programmatically without an API (meaning you need to “navigate” to the data programmatically, sometimes using a so-called headless browser, that is, a browser that doesn’t have a UI and is driven and controlled via code).
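
To make the scraping idea concrete, here is a minimal Python sketch that pulls suburbs and prices out of a small piece of HTML using only the standard library. The HTML, the `suburb`/`price` class names, and the `ListingParser` helper are all made up for illustration; a real listings page will be structured differently and may need a headless browser instead.

```python
from html.parser import HTMLParser

# Made-up HTML standing in for a listings page; real pages will look different.
SAMPLE_HTML = """
<div class="listing"><span class="suburb">Preston</span><span class="price">$950,000</span></div>
<div class="listing"><span class="suburb">Reservoir</span><span class="price">$1,250,000</span></div>
"""


class ListingParser(HTMLParser):
    """Collects the text of every <span class="suburb"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # which span we are inside, if any
        self.rows = []             # one dict per listing found

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("suburb", "price"):
            self.current_field = css_class

    def handle_data(self, data):
        if self.current_field == "suburb":
            self.rows.append({"suburb": data.strip()})
        elif self.current_field == "price":
            # Turn "$950,000" into the number 950000
            self.rows[-1]["price"] = int(data.strip().lstrip("$").replace(",", ""))

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape_listings(html):
    parser = ListingParser()
    parser.feed(html)
    return parser.rows


listings = scrape_listings(SAMPLE_HTML)
print(listings)
```

With an API, this parsing step mostly disappears, because the data already arrives in a structured form such as JSON.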

Either way you go, you’ll need to save the data into Excel. Excel Online has an API, which is itself part of the Microsoft Graph APIs, a family of APIs for accessing Microsoft products. Here are some scraping apps you could start leveraging, and a link to the Excel app integration at Pipedream:

ScrapingBee:

https://pipedream.com/apps/scrapingbee
https://pipedream.com/apps/google-sheets

OK, Excel is not onboarded yet, but you can express interest here: https://github.com/PipedreamHQ/pipedream/issues/5379

You can request realestate.com.au here: https://github.com/PipedreamHQ/pipedream/issues/new/choose (or any other app integration, for that matter). I’ll check on it later though (having breakfast right now) :sweat_smile: By the way, isn’t it too late in Australia, aka the West Island (or so says a Kiwi friend I met about two years ago)?

Hi Sergio,
Thank you I really appreciate your detailed response. I will have a look at some of the options you’ve mentioned here. Might take me a bit of playing around to understand how it all works and to put something together but I’m sure I’ll get there in the end. Thank you!

Yes it’s quite late here now, 1:20am approximately!

Hey I checked!

I looked for a realestate.com.au app integration at Pipedream, and apparently there is no official API from realestate.com.au directly.

But there is a development company named API Dojo, a software company that “just” develops APIs. They offer a realestate.com.au API on RapidAPI. RapidAPI is itself a marketplace where developers sell their APIs as a service (you pay a certain amount of money and they credit a certain number of API calls). This is very normal, as there is a whole API economy right now in the SaaS business.

Browse for realestate.com.au at API Dojo:
https://apidojo.net/api-of-brands/

It’ll link to RapidAPI: Realty in AU API Documentation (apidojo)

Then you can head to Pipedream and connect your RapidAPI key via the RapidAPI app (https://pipedream.com/apps/rapidapi). You’ll have to adjust it for your real estate case.

RapidAPI offers a free tier of 500 API calls per month, so it is a good way for you to start playing around.
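
As a rough sketch of what a call through RapidAPI looks like in Python: RapidAPI expects your key in the `X-RapidAPI-Key` header and the API's host name in `X-RapidAPI-Host`. The host name, the endpoint path, and the `suburb`/`state` parameters below are assumptions for illustration only; copy the real values from the "Realty in AU" page on RapidAPI.

```python
import json
import urllib.parse
import urllib.request

# Assumed values: check the "Realty in AU" page on RapidAPI for the real ones.
RAPIDAPI_HOST = "realty-in-au.p.rapidapi.com"  # assumed host name
SEARCH_PATH = "/properties/list"               # hypothetical endpoint path


def build_request(api_key, params):
    """Build (but do not send) a GET request carrying the RapidAPI headers."""
    query = urllib.parse.urlencode(params)
    url = f"https://{RAPIDAPI_HOST}{SEARCH_PATH}?{query}"
    headers = {
        "X-RapidAPI-Key": api_key,         # your personal key from RapidAPI
        "X-RapidAPI-Host": RAPIDAPI_HOST,  # tells RapidAPI which API you want
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_json(request):
    """Send the request and decode the JSON body. Every call uses up quota."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Hypothetical search parameters; the real API defines its own names.
request = build_request("YOUR_RAPIDAPI_KEY", {"suburb": "Preston", "state": "VIC"})
print(request.full_url)
```

Building the request separately from sending it lets you check the URL and headers before spending any of the monthly calls; `fetch_json(request)` would then perform the actual call.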

Hi Sergio,
Wow I really appreciate you looking into this for me. Thank you!

Is there a way I can arrange assistance from an engineer with this to help me set it up?
This is my first time doing something like this and I’m a bit out of my depth on the technical aspects. In the meantime I’ll continue watching YouTube tutorials etc. but thought I’d ask anyway.

Furthermore, with the site scrape request, I’m assuming that once it’s set up it will look for certain fields of information and record them in Google Sheets, and then a further step would be needed within Google Sheets to set up a formula based on the conditions I’m looking for. For example, “suburbs with land <$1m AND townhouses selling for >$1.2m”.
Or could it all be collected and organised through Pipedream without further data analysis in Google Sheets?

To arrange assistance, you can go to Support: https://pipedream.com/support

And click on “Connect with a partner”

Regarding prebuilt components that you could reuse (e.g. using the “Realty in AU” API from API Dojo), these could be built by the Pipedream components team. There’s a backlog of apps waiting to have components created for them.

Finally, you have the right idea. It is common to copy data from a source such as a website and sink it into a staging database, which you then query. With Pipedream, you could potentially use a feature called Data Store as the sink and query from there. Just be aware of its limitations. In my experience, though, I’ve done scraping and saved the results to a third app, specifically Excel and MS SQL Server. In the end, it’ll greatly depend on the data size and the speed you’d like when pulling the data for, e.g., reporting, visualization, charts and so on.

Fantastic. Much appreciated Sergio!

Normally I’m in a different area than this (and I’m from Sydney, so I thought I would say hi).

Firstly, I have a feeling that the API is related to rental and sale management, not scraping per se. The docs are here:

If the API supports it, it would be easy to do. Probably the things to think about are:
• do you want to track price changes (±)
• what database do you want to use for this
• how fresh do you want the data to be
• who is going to use it and how

Domain looks to have a more structured API. I think the endpoint you want is this

Also, from a quick eyeball of the API, you could also offer this:
• You could easily generate a report (say, message in) with historical sales and school catchment, and you could also load this on your site if you wanted
• You could create an on-call database of other properties for sale
Also, you could just create this on the fly, so when you need the data you call the API and get the info, instead of storing it in a Google Sheet.

Hi Christopher, thanks for getting back to me about this.

I am not too sure what’s involved from a technical standpoint, but at the end of the day, what I’m trying to do is put in a prompt such as this:

“Show me suburbs where townhouses sell for >$1.2m AND houses/land in the suburb sell for <$1m.”

or

“Show me suburbs (in Victoria, Australia) where townhouses are selling $800-900k AND houses/land in the suburb have sold for $900k to 1.3m”

These are just examples, and it’s probably pretty clear what I’m trying to figure out.

If there were a function where you could enter a price range for each of the variables, using a drop-down menu or a field to type into, that would be perfect.
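
The kind of query described above can be sketched in a few lines of Python once the sales data has been collected. Everything here is an assumption for illustration: the sample rows are made up, the `find_suburbs` helper is hypothetical, and the median is used as one possible way to summarise what properties "sell for" in a suburb. The price ranges are inclusive at both ends.

```python
from collections import defaultdict
from statistics import median

# Made-up sample rows; real data would come from an API or a scraper.
SALES = [
    {"suburb": "Preston", "type": "townhouse", "price": 1_250_000},
    {"suburb": "Preston", "type": "townhouse", "price": 1_300_000},
    {"suburb": "Preston", "type": "house", "price": 950_000},
    {"suburb": "Preston", "type": "land", "price": 900_000},
    {"suburb": "Carlton", "type": "townhouse", "price": 1_400_000},
    {"suburb": "Carlton", "type": "house", "price": 1_600_000},
    {"suburb": "Sunshine", "type": "townhouse", "price": 850_000},
    {"suburb": "Sunshine", "type": "house", "price": 950_000},
]


def median_by_suburb(sales, types):
    """Median sale price per suburb, counting only the given property types."""
    prices = defaultdict(list)
    for sale in sales:
        if sale["type"] in types:
            prices[sale["suburb"]].append(sale["price"])
    return {suburb: median(values) for suburb, values in prices.items()}


def find_suburbs(sales, townhouse_range, house_land_range):
    """Suburbs whose median prices fall inside both (low, high) ranges."""
    townhouse = median_by_suburb(sales, {"townhouse"})
    house_land = median_by_suburb(sales, {"house", "land"})
    t_low, t_high = townhouse_range
    h_low, h_high = house_land_range
    matches = []
    # Only suburbs with data for both groups can satisfy an AND condition.
    for suburb in sorted(townhouse.keys() & house_land.keys()):
        if t_low <= townhouse[suburb] <= t_high and h_low <= house_land[suburb] <= h_high:
            matches.append(suburb)
    return matches


# Townhouses from $1.2m up AND houses/land up to $1m
high_margin = find_suburbs(SALES, (1_200_000, float("inf")), (0, 1_000_000))
# Townhouses $800-900k AND houses/land $900k to $1.3m
second_query = find_suburbs(SALES, (800_000, 900_000), (900_000, 1_300_000))
print(high_margin, second_query)
```

The two range arguments play the role of the drop-down or typed-in fields: changing the numbers changes the search without touching the rest of the code.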

The purpose of this is to find which suburbs/areas are worthwhile inspecting further for construction purposes.

I’ve tried some other methods, including using ChatGPT to build Python web scrapers. These worked up to a point, but not to a level I could realistically use, as there were too many incorrect items being scraped and the results were not sorted correctly in the CSV file produced.

If you have any suggestions on how to do this, it would be greatly appreciated.

Thanks mate!

Hey Todd. Soz I have been away for a week.

I mean that is all possible to do, I would be using the API for that instead of scraping (if possible).

Looking at the endpoints (Domain), there doesn’t seem to be an endpoint for, say, /VIC/ as a whole, or a way to cover a range of postcodes in a single call (although it could be a stringed array in the header). This looks like a good place to start (although maybe not for current listings):

GET /v1/salesResults/{city}/
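
Since there doesn't seem to be a single call for a whole state, one option is to make one request per city. A minimal sketch of building those URLs in Python follows; the base URL is an assumption to confirm in Domain's developer docs, the city names are placeholders, and authentication is left out entirely.

```python
from urllib.parse import quote

# Assumed base URL for the Domain API; confirm it in Domain's developer docs.
BASE_URL = "https://api.domain.com.au"


def sales_results_url(city):
    """URL for GET /v1/salesResults/{city}/ with the city name URL-encoded."""
    return f"{BASE_URL}/v1/salesResults/{quote(city)}/"


# Placeholder city names; check which values the endpoint actually accepts.
cities = ["Melbourne", "Sydney"]
urls = [sales_results_url(city) for city in cities]
for url in urls:
    print(url)
```

Each URL would then be fetched in turn and the results combined before filtering by suburb.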

Can’t really comment on the ChatGPT code builder, soz.

I think what you are trying to achieve is def possible. I would use their API where possible, as it’s cleaner from a data perspective.