This topic was automatically generated from Slack. You can find the original thread here.
Hey - I’m trying to see whether it’s possible to use Selenium inside Pipedream. There are a few threads on this in the community already, but they don’t seem conclusive: "Can Selenium be Used with Python in PipeDream?" (#2 by user-2) and "Has Pipedream with Selenium and a Headless Browser Been Used to Webscrape and Login to a Website?". I’ve been guessing at routes forward, so any pointers, or an explanation of why this isn’t possible, would be appreciated.
The main issue seems to be that Pipedream doesn’t provide a pre-installed webdriver for Selenium to automate, as is normal practice. I’ve tried installing the Firefox webdriver that comes included with Selenium, but I run into this error:
Read-only file system (os error 30)
This is the test I tried:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

def handler(pd: "pipedream"):
    # Reference data from previous steps
    print(pd.steps["trigger"]["context"]["id"])
    # Run Firefox without a visible window
    head_options = Options()
    head_options.add_argument('-headless')
    driver = webdriver.Firefox(options=head_options)
    # Load python.org and run a search from the search box
    driver.get("http://www.python.org")
    assert "Python" in driver.title
    elem = driver.find_element(By.NAME, "q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    assert "No results found." not in driver.page_source
    # End the browser session and stop the driver process
    driver.quit()
    # Return data for use in future steps
    return {"foo": {"test": True}}
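One thing that might narrow down the read-only error: a minimal sketch that probes for a writable directory and points Selenium's driver cache at it. It rests on assumptions this thread does not confirm: that the error comes from Selenium Manager writing its cache under the home directory, that the temp directory is writable in the Pipedream runtime, and that Selenium Manager honours the `SE_CACHE_PATH` environment variable. `first_writable_dir` and `point_selenium_cache_at` are hypothetical helper names. It does not install a browser, so it may only move the failure one step further along.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first directory in `candidates` where a file can be created."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            # Creating a real file is a stronger check than os.access().
            with tempfile.TemporaryFile(dir=path):
                return path
        except OSError:
            # Read-only or otherwise unusable location: try the next one.
            continue
    return None


def point_selenium_cache_at(candidates):
    """Set SE_CACHE_PATH to a writable directory, if any (assumed workaround)."""
    base = first_writable_dir(candidates)
    if base is None:
        return None
    cache = os.path.join(base, "selenium-cache")
    os.makedirs(cache, exist_ok=True)
    os.environ["SE_CACHE_PATH"] = cache
    return cache


# Assumption: the home directory is read-only here and the temp dir is not.
cache_dir = point_selenium_cache_at(
    [os.path.expanduser("~/.cache"), tempfile.gettempdir()]
)
print("Selenium cache directory:", cache_dir)
```

This would need to run before `webdriver.Firefox(...)` is called, so that the environment variable is already set when Selenium looks for a driver.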
I also tried using the chromeless PyPI package (chromeless · PyPI):
from chromeless import Chromeless
import os

os.environ['AWS_DEFAULT_REGION'] = 'eu-west-2'

# Define Selenium method
def get_title(self, url):
    self.get(url)
    return self.title

def handler(pd: "pipedream"):
    # Reference data from previous steps
    print(pd.steps["trigger"]["context"]["id"])
    # Attach the method and call it
    chrome = Chromeless()
    chrome.attach(get_title)
    print(chrome.get_title("https://google.com"))  # Returns Google
    # Return data for use in future steps
    return {"foo": {"test": True}}
I figured this might have more success, since Pipedream runs on AWS Lambda, but no: it fails with authentication errors. I could potentially use Pipedream to invoke a Lambda function and run chromeless there, but that seems to be getting convoluted.
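For reference, the invoke-a-Lambda route could look roughly like the sketch below. Everything specific in it is an assumption: a function called `selenium-scraper` (hypothetical) already deployed in my own AWS account, a payload shape of `{"url": ...}` that the function understands, `boto3` being installable in the step, and AWS credentials exposed to the step as the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. The helpers `build_invoke_kwargs` and `invoke_scraper` are hypothetical names.

```python
import json


def build_invoke_kwargs(function_name, url):
    """Build the arguments for a synchronous Lambda invocation."""
    return {
        "FunctionName": function_name,
        "InvocationType": "RequestResponse",  # wait for the function's result
        "Payload": json.dumps({"url": url}).encode("utf-8"),
    }


def invoke_scraper(client, function_name, url):
    """Invoke the function and decode its JSON response body."""
    response = client.invoke(**build_invoke_kwargs(function_name, url))
    return json.loads(response["Payload"].read())


def handler(pd: "pipedream"):
    # Imported here so the helpers above stay usable without boto3 installed.
    import boto3

    # Credentials are read from the environment (assumed to be configured).
    client = boto3.client("lambda", region_name="eu-west-2")
    # "selenium-scraper" is a hypothetical function name.
    return invoke_scraper(client, "selenium-scraper", "https://www.python.org")
```

Passing the client in as an argument keeps the invocation logic separate from the credentials, which are the part that currently fails for me.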
Is this doable? Can Pipedream run Selenium?